Skip site navigation (1) Skip section navigation (2)

Re: query becomes fas on 'SET enable_hashjoin TO off;'

From: Rajesh Kumar Mallah <mallah(dot)rajesh(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: query becomes fas on 'SET enable_hashjoin TO off;'
Date: 2009-02-10 14:29:42
Message-ID: a97c77030902100629y14bc353cq531ac8ac397dfece@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
Dear Robert,
thanks for ur interest. Our server was too loaded
what i posted my last observation, now the
other explain analyze can also be run and i am posting
both the result , as you can see latter is 55ms versus
3000 ms .

 explain analyze select lead_id from general.trade_leads join
general.profile_master as pm using(profile_id) where status ='m' and
co_name_vec @@ to_tsquery('plastic&tubes')  limit 20;

 QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=4109.11..11127.78 rows=20 width=4) (actual
time=3076.059..3076.059 rows=0 loops=1)
   ->  Hash Join  (cost=4109.11..90789.72 rows=247 width=4) (actual
time=3076.057..3076.057 rows=0 loops=1)
         Hash Cond: (trade_leads.profile_id = pm.profile_id)
         ->  Seq Scan on trade_leads  (cost=0.00..85752.52 rows=246832
width=8) (actual time=0.020..2972.446 rows=127371 loops=1)
               Filter: ((status)::text = 'm'::text)
         ->  Hash  (cost=4095.68..4095.68 rows=1074 width=4) (actual
time=42.368..42.368 rows=7 loops=1)
               ->  Bitmap Heap Scan on profile_master pm
(cost=40.89..4095.68 rows=1074 width=4) (actual time=42.287..42.360
rows=7 loops=1)
                     Filter: (co_name_vec @@ '''plastic'' & ''tube'''::tsquery)
                     ->  Bitmap Index Scan on
profile_master_co_name_vec  (cost=0.00..40.62 rows=1074 width=0)
(actual time=42.252..42.252 rows=7 loops=1)
                           Index Cond: (co_name_vec @@ '''plastic'' &
''tube'''::tsquery)
 Total runtime: 3076.121 ms
(11 rows)

tradein_clients=> SET enable_hashjoin TO off;
SET
tradein_clients=> explain analyze select lead_id from
general.trade_leads join general.profile_master as pm
using(profile_id) where status ='m' and co_name_vec @@
to_tsquery('plastic&tubes')  limit 20;

     QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3.42..13080.44 rows=20 width=4) (actual
time=55.233..55.233 rows=0 loops=1)
   ->  Nested Loop  (cost=3.42..161504.56 rows=247 width=4) (actual
time=55.232..55.232 rows=0 loops=1)
         ->  Index Scan using profile_master_co_name_vec on
profile_master pm  (cost=0.00..4335.36 rows=1074 width=4) (actual
time=16.578..46.175 rows=7 loops=1)
               Index Cond: (co_name_vec @@ '''plastic'' & ''tube'''::tsquery)
               Filter: (co_name_vec @@ '''plastic'' & ''tube'''::tsquery)
         ->  Bitmap Heap Scan on trade_leads  (cost=3.42..145.75
rows=47 width=8) (actual time=1.287..1.287 rows=0 loops=7)
               Recheck Cond: (trade_leads.profile_id = pm.profile_id)
               Filter: ((status)::text = 'm'::text)
               ->  Bitmap Index Scan on trade_leads_profile_id
(cost=0.00..3.41 rows=47 width=0) (actual time=1.285..1.285 rows=0
loops=7)
                     Index Cond: (trade_leads.profile_id = pm.profile_id)
 Total runtime: 55.333 ms
(11 rows)

 SELECT SUM(1) FROM general.trade_leads WHERE status = 'm';
  sum
--------
 127371

this constitutes 90% of the total rows.

regds
mallah.


On Tue, Feb 10, 2009 at 6:36 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Feb 10, 2009 at 5:31 AM, Rajesh Kumar Mallah
> <mallah(dot)rajesh(at)gmail(dot)com> wrote:
>> I have a query in which two huge tables (A,B) are joined using an indexed
>> column and a search is made on tsvector on some column on B. Very limited
>> rows of B are  expected to match the query on tsvector column.
>>
>> With default planner settings the query takes too long ( > 100 secs) , but
>> with hashjoin off  it returns almost immediately. The question is , is
>> it is advisable to
>> tweak planner settings for specific queries in application ?
>
> The ones that start with "enable_" usually shouldn't be changed.
> They're mostly for debugging and finding problems.
>
>> The plans are as follows.
>
> It's a little hard to figure out what's gone wrong here because you've
> only included EXPLAIN ANALYZE output for one of the plans - the other
> is just regular EXPLAIN.  Can you send that, along with the output of
> the following query:
>
> SELECT SUM(1) FROM trade_leads WHERE status = 'm'
>
> I'm guessing that the problem is that the selectivity estimate for
> co_name_vec @@ to_tsquery('plastic&tubes') is not very good, but I'm
> not real familiar with full text search, so I'm not sure whether
> there's anything sensible you can do about it.
>
> ...Robert
>

In response to

Responses

pgsql-performance by date

Next:From: Rajesh Kumar MallahDate: 2009-02-10 14:36:50
Subject: Re: ERROR: Can't use an undefined value as an ARRAY reference at /usr/lib/perl5/site_perl/5.8.8/Test/Parser/Dbt2.pm line 521.
Previous:From: Robert HaasDate: 2009-02-10 13:06:30
Subject: Re: query becomes fas on 'SET enable_hashjoin TO off;'

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group