Re: GROUP BY vs DISTINCT

From: "Peter Childs" <peterachilds(at)gmail(dot)com>
To: "Postgresql Performance" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: GROUP BY vs DISTINCT
Date: 2006-12-20 11:16:40
Message-ID: a2de01dd0612200316g92cc189jf3369ccedf1b8c12@mail.gmail.com
Lists: pgsql-performance

On 20/12/06, Steinar H. Gunderson <sgunderson(at)bigfoot(dot)com> wrote:
> On Tue, Dec 19, 2006 at 11:19:39PM -0800, Brian Herlihy wrote:
> > Actually, I think I answered my own question already. But I want to
> > confirm - Is the GROUP BY faster because it doesn't have to sort results,
> > whereas DISTINCT must produce sorted results? This wasn't clear to me from
> > the documentation. If it's true, then I could save considerable time by
> > using GROUP BY where I have been using DISTINCT in the past. Usually I
> > simply want a count of the distinct values, and there is no need to sort
> > for that.
>
> You are right; at the moment, GROUP BY is more intelligent than DISTINCT,
> even if they have to compare the same columns. This is, as always, something
> that could be improved in a future release, TTBOMK.
>
> /* Steinar */

Oh, so that's why GROUP BY is nearly always quicker than DISTINCT. I
always thought DISTINCT was just shorthand for "GROUP BY the same columns
as I've just selected".
Is sorting in a DISTINCT actually required by the SQL spec, or could we just
get the parser to rewrite DISTINCT into GROUP BY, and so remove the
extra code that having two different ways of doing the same thing must mean?
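
To make the equivalence in question concrete, here is a minimal sketch. It
assumes a made-up table "hits" with columns "site" and "page" (neither is
from this thread), and it runs against an in-memory SQLite database through
Python's sqlite3 module rather than against PostgreSQL, so it only shows
that the two forms return the same set of rows. It says nothing about which
plan PostgreSQL picks or how fast either one is.

```python
import sqlite3

# In-memory database; "hits", "site" and "page" are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (site TEXT, page TEXT)")
conn.executemany(
    "INSERT INTO hits VALUES (?, ?)",
    [("a", "/"), ("a", "/"), ("b", "/x"), ("a", "/y"), ("b", "/x")],
)

# Form 1: DISTINCT over the selected columns.
distinct_rows = conn.execute(
    "SELECT DISTINCT site, page FROM hits"
).fetchall()

# Form 2: GROUP BY the same columns, with no aggregates.
grouped_rows = conn.execute(
    "SELECT site, page FROM hits GROUP BY site, page"
).fetchall()

# Compare as sets: without ORDER BY, neither form promises any row order.
same_rows = set(distinct_rows) == set(grouped_rows)

# The "count of the distinct values" case from the quoted question.
distinct_count = conn.execute(
    "SELECT count(*) FROM (SELECT DISTINCT site, page FROM hits) AS d"
).fetchone()[0]
grouped_count = conn.execute(
    "SELECT count(*) FROM (SELECT site, page FROM hits GROUP BY site, page) AS g"
).fetchone()[0]

print(same_rows, distinct_count, grouped_count)
```

On PostgreSQL itself, running EXPLAIN on each form would show whether the
planner chooses a sort or a hash step for it.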

Peter.
