Skip site navigation (1) Skip section navigation (2)

Re: Bad query optimisation

From: Greg Stark <gsstark(at)mit(dot)edu>
To: "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
Cc: <pgsql-hackers(at)postgresql(dot)org>, "Greg Stark" <gsstark(at)mit(dot)edu>
Subject: Re: Bad query optimisation
Date: 2002-11-30 23:23:56
Message-ID: 87u1hywsj7.fsf@stark.dyndns.tv (view raw or flat)
Thread:
Lists: pgsql-hackers
"Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au> writes:

> > This is weird, it seems like min and max aren't being optimised
> > symmetrically. It seems like both of these should result in similar plans
> > and run equally fast. Instead the first is actually really slow and the
> > second is perfectly quick.
> 
> Without knowing anything about your data, if Postgres knows (from its stats
> tables) that 90% of the values in your column are above 'K0C1N2' then it
> will of course do a seq scan for the second query.

Oops, you're right that was a bad diagnosis. When I use the midpoint of the
data set they both get optimized into the same plan.

However I still think there's something wrong. It looks like postgres doesn't
know that it's possible to calculate min and max without scanning every
record.

When I run the queries below there's a big variance between the 2.3s required
to find the minimum for the whole dataset, and the .098s required to find the
maximum of the small subset below K0C1N2. Because there's an index on
postalcode it should be possible to find the minimum of any range in a single
index lookup.

It seems like this should be an important optimization given the number of
applications that request max(foo) in a broken attempt to implement sequences.
Occasionally it's not even a broken attempt too.

Incidentally, this is Postgres 7.2. Is this improved in 7.3?


bash-2.05b$ time psql -d salesoutlook -c "select min(postalcode) from postalcodes"
  min   
--------
 K0A1A0
(1 row)


real	0m2.334s
user	0m0.040s
sys	0m0.010s

bash-2.05b$ time psql -d salesoutlook -c "select max(postalcode) from postalcodes where postalcode < 'K0C1N2'"
  max   
--------
 K0C1N0
(1 row)


real	0m0.098s
user	0m0.030s
sys	0m0.020s

bash-2.05b$ time psql -d salesoutlook -c "select max(postalcode) from postalcodes where postalcode < 'L9J1J2'"
  max   
--------
 L9J1J1
(1 row)


real	0m2.128s
user	0m0.030s
sys	0m0.010s

-- 
greg


In response to

Responses

pgsql-hackers by date

Next:From: owDate: 2002-11-30 23:39:48
Subject: Re: 7.4 Wishlist
Previous:From: Christopher Kings-LynneDate: 2002-11-30 22:57:05
Subject: Re: Bad query optimisation

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group