Re: GROUP BY vs DISTINCT

From:	"Peter Childs" <peterachilds(at)gmail(dot)com>
To:	"Postgresql Performance" <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: GROUP BY vs DISTINCT
Date:	2006-12-20 11:16:40
Message-ID:	a2de01dd0612200316g92cc189jf3369ccedf1b8c12@mail.gmail.com
Lists:	pgsql-performance

On 20/12/06, Steinar H. Gunderson <sgunderson(at)bigfoot(dot)com> wrote:
> On Tue, Dec 19, 2006 at 11:19:39PM -0800, Brian Herlihy wrote:
> > Actually, I think I answered my own question already. But I want to
> > confirm - Is the GROUP BY faster because it doesn't have to sort results,
> > whereas DISTINCT must produce sorted results? This wasn't clear to me from
> > the documentation. If it's true, then I could save considerable time by
> > using GROUP BY where I have been using DISTINCT in the past. Usually I
> > simply want a count of the distinct values, and there is no need to sort
> > for that.
>
> You are right; at the moment, GROUP BY is more intelligent than DISTINCT,
> even if they have to compare the same columns. This is, as always, something
> that could be improved in a future release, TTBOMK.
>
> /* Steinar */

Oh, so that's why GROUP BY is nearly always quicker than DISTINCT. I
always thought DISTINCT was just shorthand for "GROUP BY the same columns
I've just selected".
Is sorting in a DISTINCT actually required by the SQL spec, or could we just
get the parser to rewrite DISTINCT into GROUP BY and so remove the
extra code that having two different ways of doing it must mean?
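
To make the "shorthand" idea concrete, here is a minimal sketch that runs both
forms and compares their results. It uses Python's built-in sqlite3 module
rather than PostgreSQL, so it only illustrates that the two queries return the
same set of rows; it says nothing about which plan PostgreSQL picks or how fast
either form runs. The table name (visits), its columns, and the helper
function name are made up for illustration.

```python
import sqlite3


def distinct_vs_group_by(rows):
    """Run the DISTINCT and GROUP BY forms over the same (hypothetical) table.

    Returns (distinct_rows, grouped_rows, distinct_count, grouped_count).
    Row sets are returned as Python sets because neither query promises
    any particular output order.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (site TEXT, country TEXT)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)

    # Form 1: SELECT DISTINCT over the selected columns.
    distinct_rows = conn.execute(
        "SELECT DISTINCT site, country FROM visits"
    ).fetchall()

    # Form 2: GROUP BY the same columns that are selected.
    grouped_rows = conn.execute(
        "SELECT site, country FROM visits GROUP BY site, country"
    ).fetchall()

    # Counting the distinct values, written both ways.
    distinct_count = conn.execute(
        "SELECT count(*) FROM (SELECT DISTINCT site, country FROM visits)"
    ).fetchone()[0]
    grouped_count = conn.execute(
        "SELECT count(*) FROM "
        "(SELECT site, country FROM visits GROUP BY site, country)"
    ).fetchone()[0]

    conn.close()
    return set(distinct_rows), set(grouped_rows), distinct_count, grouped_count


if __name__ == "__main__":
    sample = [("a", "NO"), ("a", "NO"), ("b", "UK"), ("a", "UK")]
    d_rows, g_rows, d_count, g_count = distinct_vs_group_by(sample)
    print(d_rows == g_rows, d_count == g_count)
```

To see the actual difference on PostgreSQL, running EXPLAIN ANALYZE on each
form against a real table is the way to compare the chosen plans.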

Peter.

In response to

Re: GROUP BY vs DISTINCT at 2006-12-20 11:00:07 from Steinar H. Gunderson

Responses

Re: GROUP BY vs DISTINCT at 2006-12-20 15:36:38 from Tom Lane
