Brian Cox <brian(dot)cox(at)ca(dot)com> writes:
> Kevin Grittner [Kevin(dot)Grittner(at)wicourts(dot)gov] wrote:
>> Is there any correlation between ts_id and ts_interval_start_time?
> only vaguely: increasing ts_interval_start_time implies increasing ts_id
> but there may be many rows (100,000's) with the same ts_interval_start_time
That's the problem then. Notice what the query plan is doing: it's
scanning the table in order by ts_id, looking for the first row that
falls within the ts_interval_start_time range. Evidently this
particular range is associated with smaller ts_ids, so you reach it a
lot sooner in a ts_id ascending scan than a ts_id descending one.
Given the estimated size of the range, scanning with the
ts_interval_start_time index wouldn't be much fun either, since it would
have to examine all rows in the range to determine the min or max ts_id.
You could possibly twiddle the cost constants to make the planner choose
that plan instead, but it's still not going to be exactly speedy.
Some experimentation suggests that it might help to provide a 2-column
index on (ts_id, ts_interval_start_time). This is still going to be
scanned in order by ts_id, but it will be possible to check the
ts_interval_start_time condition in the index, eliminating a large
number of useless trips to the heap. Whether this type of query is
important enough to justify maintaining an extra index for is something
you'll have to decide for yourself...
regards, tom lane
In response to
pgsql-performance by date
|Next:||From: Tom Lane||Date: 2009-06-19 14:38:52|
|Subject: Re: 8.4 COPY performance regression on Solaris |
|Previous:||From: Dave Dutcher||Date: 2009-06-19 14:14:37|
|Subject: Re: select max() much slower than select min()|