Skip site navigation (1) Skip section navigation (2)

Re: Mysterious query plan

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "John D(dot) Burger" <john(at)mitre(dot)org>
Cc: pgsql-general General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Mysterious query plan
Date: 2005-09-27 14:04:48
Message-ID: 11197.1127829888@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-general
"John D. Burger" <john(at)mitre(dot)org> writes:
> I mistyped a table name in the following query:

>    select * from gazContainers
> 	where gazPlaceID in (select gazPlaceID from tipsterAuxiliary);

> Turns out there's no gazPlaceID column in tipsterAuxiliary, so the  
> inner gazPlaceID was referring to the outer gazContainers table.  This  
> ran all night without finishing, before I discovered my mistake.  What  
> I can't figure out is just what the query planner came up with.  Can  
> someone explain it to me?  Is it effectively some sort of horrible  
> cross join?  Here's the terse query plan:

>    Seq Scan on gazcontainers  (cost=0.00..10830971486.25 rows=2935950  
> width=8)
>      Filter: (subplan)
>      SubPlan
>        ->  Seq Scan on tipsterauxiliary  (cost=0.00..3330.04 rows=143604  
> width=0)

Yup.  What you've got there is the naive implementation of "IN (SELECT...)",
ie, "for each row of gazContainers, run the SELECT over tipsterAuxiliary
and look for a match".  EXPLAIN isn't amazingly bright about displaying
subplan invocations, so you just see "subplan" and not anything more
specific.  (Somebody should try to fix that sometime.)

In this case, since the first row from the sub-SELECT will invariably
contain the same value the IN is seeking, we won't read any more than
one row from the sub-SELECT.  So the runtime wouldn't be proportional
to the product of the table sizes, but it would be proportional to the
size of gazContainers with some depressingly large multiplier
corresponding to the startup overhead for the sub-SELECT.

(I've seen people make the identical mistake with a NOT IN, and then
the runtime really is proportional to the product :-()

			regards, tom lane

In response to

pgsql-general by date

Next:From: boingerDate: 2005-09-27 14:07:41
Subject: Re: Performance woes relating to DISTINCT (I think)
Previous:From: Yonatan Ben-NesDate: 2005-09-27 13:56:42
Subject: Re: Slow query using LIMIT

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group