Re: query becomes fas on 'SET enable_hashjoin TO off;'

From: Rajesh Kumar Mallah <mallah(dot)rajesh(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: query becomes fas on 'SET enable_hashjoin TO off;'
Date: 2009-02-10 15:52:39
Message-ID: a97c77030902100752u47d16e8bx1a57f36af6abcc7d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, Feb 10, 2009 at 9:09 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Rajesh Kumar Mallah <mallah(dot)rajesh(at)gmail(dot)com> writes:
>> On Tue, Feb 10, 2009 at 6:36 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> I'm guessing that the problem is that the selectivity estimate for
>>> co_name_vec @@ to_tsquery('plastic&tubes') is not very good, but I'm
>>> not real familiar with full text search, so I'm not sure whether
>>> there's anything sensible you can do about it.
>
> Yeah, the bad selectivity estimate seems to be the entire problem ---
> if that were even slightly closer to reality the planner would've
> preferred the nestloop.
>
> I don't think there's a good solution to this in 8.3,

this is 8.2 server at the moment.

>because its
> estimator for @@ is just a stub. There will be a non-toy estimator
> in 8.4, fwiw.
>
> A possibility that seems a bit less crude than turning off hashjoins
> is to reduce random_page_cost, so as to bias things toward nestloop
> indexscans in general.
reducing random_page_cost from 4 (default) to 3 does switch the plan
in favour of nested loop thanks for the suggestion.

SET random_page_cost TO 4;
SET
tradein_clients=> explain select lead_id from general.trade_leads
join general.profile_master as pm using(profile_id) where status ='m'
and co_name_vec @@ to_tsquery('plastic&tubes') limit 20;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Limit (cost=4109.11..11127.78 rows=20 width=4)
-> Hash Join (cost=4109.11..90789.72 rows=247 width=4)
Hash Cond: (trade_leads.profile_id = pm.profile_id)
-> Seq Scan on trade_leads (cost=0.00..85752.52 rows=246832 width=8)
Filter: ((status)::text = 'm'::text)
-> Hash (cost=4095.68..4095.68 rows=1074 width=4)
-> Bitmap Heap Scan on profile_master pm
(cost=40.89..4095.68 rows=1074 width=4)
Filter: (co_name_vec @@ '''plastic'' & ''tube'''::tsquery)
-> Bitmap Index Scan on
profile_master_co_name_vec (cost=0.00..40.62 rows=1074 width=0)
Index Cond: (co_name_vec @@ '''plastic'' &
''tube'''::tsquery)
(10 rows)

tradein_clients=> SET random_page_cost TO 3;
SET
tradein_clients=> explain select lead_id from general.trade_leads
join general.profile_master as pm using(profile_id) where status ='m'
and co_name_vec @@ to_tsquery('plastic&tubes') limit 20;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..9944.78 rows=20 width=4)
-> Nested Loop (cost=0.00..122818.07 rows=247 width=4)
-> Index Scan using profile_master_co_name_vec on
profile_master pm (cost=0.00..3256.28 rows=1074 width=4)
Index Cond: (co_name_vec @@ '''plastic'' & ''tube'''::tsquery)
Filter: (co_name_vec @@ '''plastic'' & ''tube'''::tsquery)
-> Index Scan using trade_leads_profile_id on trade_leads
(cost=0.00..110.76 rows=45 width=8)
Index Cond: (trade_leads.profile_id = pm.profile_id)
Filter: ((status)::text = 'm'::text)
(8 rows)

>
> regards, tom lane
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message justin 2009-02-10 18:07:23 Re: explanation of some configs
Previous Message Tom Lane 2009-02-10 15:39:18 Re: query becomes fas on 'SET enable_hashjoin TO off;'