Re: 8.4.7, incorrect estimate

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Wayne Conrad <wconrad(at)yagni(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: 8.4.7, incorrect estimate
Date: 2011-05-02 15:11:58
Message-ID: 21729.1304349118@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Wayne Conrad <wconrad(at)yagni(dot)com> writes:
> On 04/29/11 12:12, Kevin Grittner wrote:
>> Out of curiosity, what do you get with?:
>>
>> explain analyze
>> select
>> page_number,
>> ps_id,
>> ps_page_id
>> from ps_page p
>> where exists
>> (
>> select * from documents_ps_page d
>> where d.ps_page_id = p.ps_page_id
>> and exists
>> (select * from temp_document_ids t
>> where t.document_id = d.document_id)
>> )
>> order by ps_page_id

> Merge Semi Join (cost=186501.69..107938082.91 rows=29952777 width=12)
> (actual time=242801.828..244572.318 rows=5 loops=1)
> Merge Cond: (p.ps_page_id = d.ps_page_id)
> -> Index Scan using ps_page_pkey on ps_page p
> (cost=0.00..2995637.47 rows=86141904 width=12) (actual
> time=0.052..64140.510 rows=85401688 loops=1)
> -> Index Scan using documents_ps_page_ps_page_id_idx on
> documents_ps_page d (cost=0.00..104384546.06 rows=37358320 width=4)
> (actual time=161483.657..163254.131 rows=5 loops=1)
> Filter: (alternatives: SubPlan 1 or hashed SubPlan 2)
> SubPlan 1
> -> Seq Scan on temp_doc_ids t (cost=0.00..1.35 rows=1
> width=0) (never executed)
> Filter: (document_id = $0)
> SubPlan 2
> -> Seq Scan on temp_doc_ids t (cost=0.00..1.34 rows=5
> width=35) (actual time=0.005..0.007 rows=5 loops=1)
> Total runtime: 244572.432 ms
> (11 rows)

[ pokes at that ... ] I think what you've got here is an oversight in
the convert-EXISTS-to-semijoin logic: it pulls up the outer EXISTS but
fails to recurse on it, which would be needed to convert the lower
EXISTS into a semijoin as well, which is what's needed in order to get
a non-bogus selectivity estimate for it.

I'll take a look at fixing that, but not sure if it'll be reasonable to
back-patch or not. In the meantime, you need to look into restructuring
the query to avoid nesting the EXISTS probes, if possible.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Greg Smith 2011-05-02 15:18:52 Re: The right SHMMAX and FILE_MAX
Previous Message Wayne Conrad 2011-05-02 13:19:55 Re: 8.4.7, incorrect estimate