Re: 8.4.7, incorrect estimate

From: Wayne Conrad <wconrad(at)yagni(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: 8.4.7, incorrect estimate
Date: 2011-05-02 16:14:41
Message-ID: 4DBED871.4010306@yagni.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 05/02/11 08:11, Tom Lane wrote:
> Wayne Conrad<wconrad(at)yagni(dot)com> writes:
>> On 04/29/11 12:12, Kevin Grittner wrote:
>>> Out of curiosity, what do you get with?:
>>>
>>> explain analyze
>>> select
>>> page_number,
>>> ps_id,
>>> ps_page_id
>>> from ps_page p
>>> where exists
>>> (
>>> select * from documents_ps_page d
>>> where d.ps_page_id = p.ps_page_id
>>> and exists
>>> (select * from temp_document_ids t
>>> where t.document_id = d.document_id)
>>> )
>>> order by ps_page_id
>
>> Merge Semi Join (cost=186501.69..107938082.91 rows=29952777 width=12)
>> (actual time=242801.828..244572.318 rows=5 loops=1)
>> Merge Cond: (p.ps_page_id = d.ps_page_id)
>> -> Index Scan using ps_page_pkey on ps_page p
>> (cost=0.00..2995637.47 rows=86141904 width=12) (actual
>> time=0.052..64140.510 rows=85401688 loops=1)
>> -> Index Scan using documents_ps_page_ps_page_id_idx on
>> documents_ps_page d (cost=0.00..104384546.06 rows=37358320 width=4)
>> (actual time=161483.657..163254.131 rows=5 loops=1)
>> Filter: (alternatives: SubPlan 1 or hashed SubPlan 2)
>> SubPlan 1
>> -> Seq Scan on temp_doc_ids t (cost=0.00..1.35 rows=1
>> width=0) (never executed)
>> Filter: (document_id = $0)
>> SubPlan 2
>> -> Seq Scan on temp_doc_ids t (cost=0.00..1.34 rows=5
>> width=35) (actual time=0.005..0.007 rows=5 loops=1)
>> Total runtime: 244572.432 ms
>> (11 rows)
>
> [ pokes at that ... ] I think what you've got here is an oversight in
> the convert-EXISTS-to-semijoin logic: it pulls up the outer EXISTS but
> fails to recurse on it, which would be needed to convert the lower
> EXISTS into a semijoin as well, which is what's needed in order to get
> a non-bogus selectivity estimate for it.
>
> I'll take a look at fixing that, but not sure if it'll be reasonable to
> back-patch or not. In the meantime, you need to look into restructuring
> the query to avoid nesting the EXISTS probes, if possible.
>
> regards, tom lane
>

Tom,

Thanks for looking at this. FYI, the same problem occurs when nesting
"where ... in (...)" (see start of thread, or I can repost it if you
want). In any case, I can make the problem go away by using another
layer of temporary table to avoid the nesting. That's what I'll do for now.

I'm not worried about back-patches to fix this in 8.4. We'll be
upgrading this box to 9 at some point; we'll just pick up any fix when
it hits 9.

Best Regards,
Wayne Conrad

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Mark 2011-05-02 20:54:06 Re: Query improvement
Previous Message Kevin Grittner 2011-05-02 16:04:56 Re: 8.4.7, incorrect estimate