The only connection to MVCC is that the "obvious" solution, namely storing a
cache of the aggregate in the table information, doesn't work.
So what would it take to implement this for "all" aggregates? I think "all"
here really just means min(), max(), first(), and last().
I think it would mean having a way to declare, when defining an aggregate,
that only specific records are necessary. For first() and last() the
declaration would only have to indicate that only the first or last record of
each group, in the pre-existing order, is necessary.
For min() and max() it would have to indicate not only that only the first or
last record is necessary, but also the sort order to impose.
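
To make the idea concrete, here is a rough sketch of what such a declaration
might carry. It is modelled in Python purely for illustration: AggregateHint,
its fields, and the HINTS table are hypothetical names, not an existing
PostgreSQL facility.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AggregateHint:
    """Hypothetical per-aggregate declaration (illustration only)."""
    name: str
    pick: str                       # "first" or "last" record of each group
    sort_dir: Optional[str] = None  # None: keep the pre-existing order
                                    # "asc": sort on the aggregate's argument


# first()/last() need only one record and impose no sort order;
# min()/max() need one record AND an ordering on their argument.
HINTS = {
    "first": AggregateHint("first", pick="first"),
    "last": AggregateHint("last", pick="last"),
    "min": AggregateHint("min", pick="first", sort_dir="asc"),
    "max": AggregateHint("max", pick="last", sort_dir="asc"),
}
```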
Then if the optimizer determines that all the aggregates used either impose no
sort order or impose compatible sort orders, it should insert an extra sort
step before the grouping and flag the executor to do DISTINCT ON style
behaviour, skipping the unneeded records.
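
A minimal sketch of that planner check and executor behaviour, again in Python
and again under assumptions: rows are plain dicts, and compatible_sort and
first_last_per_group are made-up names. A real executor could stop reading a
group once it has the record it needs; this toy version still walks every row
and only shows which records end up mattering.

```python
from itertools import groupby
from operator import itemgetter


def compatible_sort(calls):
    """calls: (column, needs_sort) pairs, one per aggregate call.

    Returns the single sort column required, or None if no aggregate
    imposes one. Raises ValueError on incompatible sort orders.
    """
    cols = {col for col, needs_sort in calls if needs_sort}
    if len(cols) > 1:
        raise ValueError("incompatible sort orders: %s" % sorted(cols))
    return next(iter(cols), None)


def first_last_per_group(rows, group_col, sort_col=None):
    """Sort, then keep only the first and last row of each group."""
    if sort_col is None:
        # Stable sort: the pre-existing order survives within each group.
        key = itemgetter(group_col)
    else:
        key = itemgetter(group_col, sort_col)
    ordered = sorted(rows, key=key)

    result = {}
    for group, members in groupby(ordered, key=itemgetter(group_col)):
        first = last = next(members)
        for last in members:  # run to the end of the group
            pass
        result[group] = (first, last)
    return result
```

With sort_col set to the aggregated column, min() is read from the first row of
each group and max() from the last; with sort_col left as None, the same two
rows answer first() and last().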
Now the problem I see is this: if there's no index on the imposed sort order,
and the previous step wasn't a merge join or something else that returns the
records in order, then it's not necessarily any faster to sort the records and
return only some of them. It might be for small numbers of records, but it
might be faster to just read them all in and check each one for min/max the
linear way.
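
For comparison, "the linear way" looks roughly like the sketch below (Python,
rows as dicts, linear_min_max a made-up name). It reads every record once and
never sorts, so the work grows linearly with the number of records, whereas a
comparison sort of n records costs on the order of n log n.

```python
def linear_min_max(rows, group_col, value_col):
    """One pass, no sort: track the running min and max per group."""
    result = {}
    for row in rows:
        group, value = row[group_col], row[value_col]
        if group not in result:
            result[group] = (value, value)
        else:
            lo, hi = result[group]
            result[group] = (min(lo, value), max(hi, value))
    return result
```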