Skip site navigation (1) Skip section navigation (2)

Re: select max() much slower than select min()

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Brian Cox <brian(dot)cox(at)ca(dot)com>
Cc: "Kevin Grittner [Kevin(dot)Grittner(at)wicourts(dot)gov]" <Kevin(dot)Grittner(at)wicourts(dot)gov>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: select max() much slower than select min()
Date: 2009-06-19 14:26:15
Message-ID: 5750.1245421575@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-performance
Brian Cox <brian(dot)cox(at)ca(dot)com> writes:
> Kevin Grittner [Kevin(dot)Grittner(at)wicourts(dot)gov] wrote:
>> Is there any correlation between ts_id and ts_interval_start_time?

> only vaguely: increasing ts_interval_start_time implies increasing ts_id 
> but there may be many rows (100,000's) with the same ts_interval_start_time

That's the problem then.  Notice what the query plan is doing: it's
scanning the table in order by ts_id, looking for the first row that
falls within the ts_interval_start_time range.  Evidently this
particular range is associated with smaller ts_ids, so you reach it a
lot sooner in a ts_id ascending scan than a ts_id descending one.

Given the estimated size of the range, scanning with the
ts_interval_start_time index wouldn't be much fun either, since it would
have to examine all rows in the range to determine the min or max ts_id.
You could possibly twiddle the cost constants to make the planner choose
that plan instead, but it's still not going to be exactly speedy.

Some experimentation suggests that it might help to provide a 2-column
index on (ts_id, ts_interval_start_time).  This is still going to be
scanned in order by ts_id, but it will be possible to check the
ts_interval_start_time condition in the index, eliminating a large
number of useless trips to the heap.  Whether this type of query is
important enough to justify maintaining an extra index for is something
you'll have to decide for yourself...

			regards, tom lane

In response to

Responses

pgsql-performance by date

Next:From: Tom LaneDate: 2009-06-19 14:38:52
Subject: Re: 8.4 COPY performance regression on Solaris
Previous:From: Dave DutcherDate: 2009-06-19 14:14:37
Subject: Re: select max() much slower than select min()

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group