Re: select max() much slower than select min()

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Brian Cox <brian(dot)cox(at)ca(dot)com>
Cc: "Kevin Grittner [Kevin(dot)Grittner(at)wicourts(dot)gov]" <Kevin(dot)Grittner(at)wicourts(dot)gov>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: select max() much slower than select min()
Date: 2009-06-19 14:26:15
Message-ID: 5750.1245421575@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Brian Cox <brian(dot)cox(at)ca(dot)com> writes:
> Kevin Grittner [Kevin(dot)Grittner(at)wicourts(dot)gov] wrote:
>> Is there any correlation between ts_id and ts_interval_start_time?

> only vaguely: increasing ts_interval_start_time implies increasing ts_id
> but there may be many rows (100,000's) with the same ts_interval_start_time

That's the problem then. Notice what the query plan is doing: it's
scanning the table in order by ts_id, looking for the first row that
falls within the ts_interval_start_time range. Evidently this
particular range is associated with smaller ts_ids, so you reach it a
lot sooner in a ts_id ascending scan than a ts_id descending one.

Given the estimated size of the range, scanning with the
ts_interval_start_time index wouldn't be much fun either, since it would
have to examine all rows in the range to determine the min or max ts_id.
You could possibly twiddle the cost constants to make the planner choose
that plan instead, but it's still not going to be exactly speedy.

Some experimentation suggests that it might help to provide a 2-column
index on (ts_id, ts_interval_start_time). This is still going to be
scanned in order by ts_id, but it will be possible to check the
ts_interval_start_time condition in the index, eliminating a large
number of useless trips to the heap. Whether this type of query is
important enough to justify maintaining an extra index for is something
you'll have to decide for yourself...

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2009-06-19 14:38:52 Re: 8.4 COPY performance regression on Solaris
Previous Message Dave Dutcher 2009-06-19 14:14:37 Re: select max() much slower than select min()