Re: Planner ignoring to use INDEX SCAN

From: Richard Huxton <dev(at)archonet(dot)com>
To: Ashish Karalkar <ashish_postgre(at)yahoo(dot)co(dot)in>
Cc: pggeneral <pgsql-general(at)postgresql(dot)org>, ashish(dot)karalkar(at)netcore(dot)co(dot)in
Subject: Re: Planner ignoring to use INDEX SCAN
Date: 2007-12-14 13:00:54
Message-ID: 47627E86.6020509@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Ashish Karalkar wrote:
> Thanks Richard for your replay,
>
> here is the output..
>
> Richard Huxton <dev(at)archonet(dot)com> wrote: Ashish Karalkar wrote:
>> Richard Huxton wrote: Ashish Karalkar wrote:
>>> query which was taking seconds on the join of these two table
>>> suddenly started taking 20/25 min
>> Show the EXPLAIN ANALYSE of your problem query and someone will be able
>> to tell you why.
>>
>> Here is the output from explain analyse:
>
> Actually, this is the output from EXPLAIN not EXPLAIN ANALYSE. It
> doesn't show what actually happened, just what the planner thought was
> going to happen.
>
> Its from EXPLAIN ANALYSE

No it's not, because that shows an extra set of figures. If you see
below there is a cost for each stage and an actual time too.

EXPLAIN ANALYSE SELECT d.id, l.name FROM items.documents d JOIN
lookups.document_class l ON d.class=l.id;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Hash Join (cost=1.16..14.00 rows=162 width=19) (actual
time=0.100..0.392 rows=162 loops=1)
Hash Cond: ((d.class)::text = (l.id)::text)
-> Seq Scan on documents d (cost=0.00..10.62 rows=162 width=9)
(actual time=0.024..0.121 rows=162 loops=1)
-> Hash (cost=1.07..1.07 rows=7 width=20) (actual
time=0.029..0.029 rows=8 loops=1)
-> Seq Scan on document_class l (cost=0.00..1.07 rows=7
width=20) (actual time=0.008..0.015 rows=8 loops=1)
Total runtime: 0.506 ms
(6 rows)

> Are the row-estimates roughly accurate?
>
> Yes Row count of sms_new is approx. same

OK, that's good.

>> table structures are more or less same with delivery being parent and sms_new being child having index on deliveryid in both tables.
>>
>> HashAggregate (cost=6153350.21..6153352.38 rows=174 width=32)
>> -> Hash Join (cost=218058.30..6153259.97 rows=6016 width=32)
>> Hash Cond: ("outer".deliveryid = "inner".deliveryid)
>> -> Seq Scan on sms_new (cost=0.00..5240444.80 rows=138939341 width=8)
>> Filter: ((otid)::text !~~ 'ERROR%'::text)
>> -> Hash (cost=218057.87..218057.87 rows=174 width=32)
>
> Well, it knows that it's going to be expensive (cost=5240444.80). Since
> it thinks you'll only get 174 rows from the other side and 6016
> matching, I can't see how an index could be calculated as more expensive.
>
> Try issuing ENABLE seq_scan=off and re-running the EXPLAIN, let's see
> what cost that comes up with.
>
> here is explain out put after setting enable_seqscan=off
>
> HashAggregate (cost=27729224.21..27729226.21 rows=160 width=32)
> -> Nested Loop (cost=2534.67..27729143.31 rows=5393 width=32)
> -> Bitmap Heap Scan on delivery (cost=2094.41..216143.78 rows=160 width=32)

> -> Bitmap Heap Scan on sms_new (cost=440.26..171369.61 rows=46931 width=8)

Well, the estimated cost for this one is up to 27 million from the
previous 6 million. It's doing two bitmap scans and then the nested loop
which is what's pushing the cost up.

Can you post the query too?

--
Richard Huxton
Archonet Ltd

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Richard Huxton 2007-12-14 13:01:42 Re: rewrite pl/pgsql functions to c - remote job
Previous Message Marek Lewczuk 2007-12-14 12:50:05 rewrite pl/pgsql functions to c - remote job