Skip site navigation (1) Skip section navigation (2)

Re: Query plan, nested EXISTS

From: Matt Daw <matt(at)shotgunsoftware(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query plan, nested EXISTS
Date: 2012-09-28 22:56:10
Message-ID: CAA2LLOFkL2VepKAUsRUirfgJeT5P=SMUNntTOd194cR_1xooMA@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hi Tom, v9.2.1 looks good!

 Aggregate  (cost=420808.99..420809.00 rows=1 width=0) (actual
time=147.345..147.345 rows=1 loops=1)
   ->  Nested Loop Semi Join  (cost=0.00..420786.71 rows=8914 width=0)
(actual time=13.847..147.219 rows=894 loops=1)
         ->  Index Scan using notes_retirement_date_project on notes a
 (cost=0.00..67959.22 rows=12535 width=4) (actual time=13.811..71.741
rows=12469 loops=1)
               Index Cond: (project_id = 114)
         ->  Nested Loop Semi Join  (cost=0.00..28.14 rows=1 width=4)
(actual time=0.006..0.006 rows=0 loops=12469)
               ->  Index Scan using note_links_note on note_links b
 (cost=0.00..12.37 rows=1 width=8) (actual time=0.002..0.002 rows=1
loops=12469)
                     Index Cond: (note_id = a.id)
                     Filter: ((retirement_date IS NULL) AND
((entity_type)::text = 'Version'::text))
                     Rows Removed by Filter: 1
               ->  Index Scan using versions_pkey on versions c
 (cost=0.00..15.76 rows=1 width=4) (actual time=0.003..0.003 rows=0
loops=11794)
                     Index Cond: (id = b.entity_id)
                     Filter: ((retirement_date IS NULL) AND ((code)::text
~~* '%comp%'::text))
                     Rows Removed by Filter: 1
 Total runtime: 147.411 ms
(14 rows)

On Fri, Sep 28, 2012 at 2:47 PM, Matt Daw <matt(at)shotgunsoftware(dot)com> wrote:

> Hi Tom, thank you very much. I'll load these tables onto a 9.2 instance
> and report back.
>
> Matt
>
>
> On Fri, Sep 28, 2012 at 2:44 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> Matt Daw <matt(at)shotgunsoftware(dot)com> writes:
>> > Howdy, I've been debugging a client's slow query today and I'm curious
>> > about the query plan. It's picking a plan that hashes lots of rows from
>> the
>> > versions table (on v9.0.10)...
>>
>> > EXPLAIN ANALYZE
>> > SELECT COUNT(*) FROM notes a WHERE
>> > a.project_id = 114 AND
>> > EXISTS (
>> >     SELECT 1 FROM note_links b
>> >     WHERE
>> >     b.note_id = a.id AND
>> >     b.entity_type = 'Version' AND
>> >     EXISTS (
>> >         SELECT 1 FROM versions c
>> >         WHERE
>> >         c.id = b.entity_id AND
>> >         c.code ILIKE '%comp%' AND
>> >         c.retirement_date IS NULL
>> >     ) AND
>> >     b.retirement_date IS NULL
>> > )
>>
>> I think the real problem here is that 9.0 is incapable of avoiding a
>> full table scan on "note_links", which means it doesn't really have any
>> better option than to do the inner EXISTS as a full-table semijoin.
>> This is because it can't push a.id down through two levels of join, and
>> because the semijoins don't commute, there's no way to get a.id into the
>> scan of note_links to pull out only the useful rows.  The hack with
>> LIMIT avoids this problem by preventing the inner EXISTS from being
>> treated as a full-fledged semijoin; but of course that hack leaves you
>> vulnerable to very bad plans if the statistics are such that a nestloop
>> join isn't the best bet for the inner EXISTS.
>>
>> The work I did for parameterized paths in 9.2 was intended to address
>> exactly this type of scenario.  I would be interested to know if 9.2
>> does this any better for you.
>>
>>                         regards, tom lane
>>
>
>

In response to

pgsql-performance by date

Next:From: Виктор ЕгоровDate: 2012-09-28 23:11:58
Subject: NestedLoops over BitmapScan question
Previous:From: Tom LaneDate: 2012-09-28 22:25:57
Subject: Re: Possible Performance Regression with Transitive Comparisons vs. Constants

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group