Re: [HACKERS] DISTINCT ON: speak now or forever hold your peace

From: Philip Warner <pjw(at)rhyme(dot)com(dot)au>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, pgsql-sql(at)postgresql(dot)org
Subject: Re: [HACKERS] DISTINCT ON: speak now or forever hold your peace
Date: 2000-01-26 22:23:24
Message-ID: 3.0.5.32.20000127092324.03474eb0@mail.rhyme.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-sql

At 19:34 26/01/00 +0100, Peter Eisentraut wrote:
>On 2000-01-24, Tom Lane mentioned:
>
>> If I don't hear loud hollers very soon, I'm going to eliminate the
>> DISTINCT ON "feature" for 7.0. As previously discussed, this feature
>> is not standard SQL and has no clear semantic interpretation.
>
>Our documents say that DISTINCT ON is equivalent to GROUP BY. I still
>don't see why that wouldn't be true. You can always rewrite
>
>select distinct on a a,b from test
>
>as
>
>select a, xxx(b) from test group by a
>
>where xxx is some aggregate function (presumably min or max).
>
>You can also rewrite
>
>select distinct on a a,b,c from test
>
>as
>
>select a, b, c from test group by a, b, c
>
>or using some aggregates here as well. At least you can control your
>results that way.

I only learned about DISTINCT ON in this discussion, but my impression is
that it has one advantage over GROUP BY, i that it produces a 'consistent'
tuple. ie. the tuple it produces is guaranteed to exist in the database,
whereas using GROUP BY with aggregates will not produce a 'real' row:

f1|f2|f3
--------
1 2 3
1 3 2
2 3 1
2 1 3
3 1 2
3 2 1

'select distinct on f1 f1,f2,f3 from <somewhere> order by f1, f2, f3' will
produce (I think):

f1|f2|f3
--------
1 2 3
2 1 3
3 1 2

...where each resulting tuple actually exists in the DB. I can't see a way
of doing this with aggreagates unless a 'FIRST' function is defined, but
that (as already discussed) will not obey the 'order by' clause. Maybe an
'ANY' aggregate would do it, but then it would not always produce the same
results.

----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.C.N. 008 659 498) | /(@) ______---_
Tel: +61-03-5367 7422 | _________ \
Fax: +61-03-5367 7430 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2000-01-26 22:33:58 Re: AW: AW: [HACKERS] Some notes on optimizer cost estimates
Previous Message Ricardo Coelho 2000-01-26 20:46:52 Re: OIDS (Re: [HACKERS] Well, then you keep your darn columns)

Browse pgsql-sql by date

  From Date Subject
Next Message Chris Bitmead 2000-01-26 22:53:29 Re: [HACKERS] Re: [SQL] DISTINCT ON: speak now or forever hold your peace
Previous Message Peter Bojanic 2000-01-26 20:28:04 Help understanding how indexes are used by the query optimizer