Re: slow plan for min/max

From: "Matt Clark" <matt(at)ymogen(dot)net>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: slow plan for min/max
Date: 2003-09-10 00:22:09
Message-ID: LFEIJBEOKGPDHCEMDGNFOEHNCAAA.matt@ymogen.net
Lists: pgsql-performance

> > Know what we (OK, I) need? An explicitly non-aggregate max() and min(),
> > implemented differently, so they can be optimised.
>
> Not per se. The way I've been visualizing this is that we add to
> pg_aggregate a column named, say, aggsortop, with the definition:
...snip of cunning, potentially generalisable plan...
> How do you structure the resulting query plan, if it's at all complex
> (think multiple aggregate calls...)? I'm not clear on the answers to
> any of those questions, so I'm not volunteering to try to code it up ...
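The snipped aggsortop idea boils down to teaching the planner that certain aggregates are equivalent to an index-friendly ordered lookup. A rough sketch of the transformation it would perform automatically (table and column names are made up for illustration):

```sql
-- What the user writes:
SELECT max(created_at) FROM orders;

-- What the planner could rewrite it to, given an index on created_at
-- and an aggsortop entry linking max() to the ">" sort operator:
SELECT created_at FROM orders
WHERE created_at IS NOT NULL   -- max() ignores NULLs, so the rewrite must too
ORDER BY created_at DESC
LIMIT 1;
```

The complexity Tom alludes to is real: with multiple aggregates in one target list, each would need its own index probe, and stitching those into a single plan tree is the unanswered part.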

So, you're not going to code it, I'm not going to code it, I doubt anyone
else is soon.

The issue is going to remain, then, that max() and min() are implemented in a
way that is counterintuitively slow for 99% of uses. It's not bad or wrong,
just a consequence of many higher-level factors. This should
therefore be very prominently flagged in the docs until there is either a
general or specific solution.

FYI I have rewritten 4 queries today to work around this (with nice
performance benefits) as a result of this thread. Yeah, I should have
spotted the _silly_ seq scans beforehand, but if you're not looking, you
don't tend to see. Best improvement is 325msec to 0.60msec!

I'm happy to do the doc work.

M
