Re: big distinct clause vs. group by

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Uwe Bartels <uwe(dot)bartels(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: big distinct clause vs. group by
Date: 2011-04-23 19:34:11
Message-ID: 4B594466-0554-4AF8-A8BC-8FDA9C393DF9@gmail.com
Lists: pgsql-performance

On Apr 18, 2011, at 1:13 PM, Uwe Bartels <uwe(dot)bartels(at)gmail(dot)com> wrote:
> Hi Robert,
>
> Thanks for your answer.
> The aggregate function I was talking about is the one I need to apply to the non-GROUP BY columns, like min() in my example.
> There are of course several functions to choose from, and I wanted to know which one uses the fewest resources.

Oh, I see. min() is probably as good as anything. You could also create a custom aggregate that just always returns its first input. I've occasionally wished we had such a thing as a built-in.
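
A minimal sketch of the "always returns its first input" aggregate, for illustration only. It uses Python's sqlite3 module as a runnable stand-in; in PostgreSQL the equivalent would be defined with CREATE AGGREGATE. The table `orders`, its columns, and the name `first_input` are hypothetical and not taken from the original query.

```python
import sqlite3


class FirstInput:
    """Aggregate that keeps the first value it is given and ignores the rest."""

    def __init__(self):
        self.seen = False
        self.value = None

    def step(self, value):
        # Only the first call per group stores anything.
        if not self.seen:
            self.seen = True
            self.value = value

    def finalize(self):
        return self.value


conn = sqlite3.connect(":memory:")
# Register the aggregate under a hypothetical name, taking one argument.
conn.create_aggregate("first_input", 1, FirstInput)

conn.execute(
    "CREATE TABLE orders (customer_id INTEGER, customer_name TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 10), (1, "alice", 20), (2, "bob", 5)],
)

# customer_name is the same for every row of a group, so min() and
# first_input() agree; either one just carries the column through GROUP BY.
rows = conn.execute(
    "SELECT customer_id, first_input(customer_name), min(customer_name) "
    "FROM orders GROUP BY customer_id ORDER BY customer_id"
).fetchall()
print(rows)
```

This only makes sense when the extra column has a single value per group. Otherwise "first" depends on the order in which rows happen to reach the aggregate.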

Another option is to try to rewrite the query with a subselect so that you do the aggregation first and then add the extra columns by joining against the output of the aggregate. If this can be done without joining the same table twice, it's often much faster, but it isn't always possible. :-(
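
A rough sketch of that rewrite, again using Python's sqlite3 module only so the example runs on its own. The `customers` and `orders` tables are hypothetical, since the original query is not shown in this message. The first query joins and then groups, carrying the extra columns through min(). The second aggregates `orders` in a subselect and only then joins `customers` for the extra columns, so no table is joined twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'alice', 'Berlin'), (2, 'bob', 'Hamburg');
    INSERT INTO orders VALUES (1, 10), (1, 20), (2, 5);
    """
)

# Join first, then group: the extra columns need an aggregate such as min().
join_then_group = """
    SELECT c.customer_id, min(c.name), min(c.city), sum(o.amount)
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
"""

# Aggregate first in a subselect, then join to pick up the extra columns.
group_then_join = """
    SELECT c.customer_id, c.name, c.city, t.total
    FROM (SELECT customer_id, sum(amount) AS total
          FROM orders
          GROUP BY customer_id) AS t
    JOIN customers c ON c.customer_id = t.customer_id
    ORDER BY c.customer_id
"""

wide_rows = conn.execute(join_then_group).fetchall()
rewritten_rows = conn.execute(group_then_join).fetchall()
print(wide_rows)
print(rewritten_rows)
```

Both queries return the same rows here. Whether the second form is actually faster on a real PostgreSQL table has to be checked with EXPLAIN ANALYZE against the real data.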

...Robert
