Re: ORDER BY and DISTINCT ON

From: Neil Conway <neilc(at)samurai(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: ORDER BY and DISTINCT ON
Date: 2003-12-14 03:42:39
Message-ID: 87ekv8qd3k.fsf@mailbox.samurai.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> Do you really want:
>
> select distinct on (b,c,a) a,b,c from abc order by b,c,a;
>
> or is that you want
>
> select * from (select distinct on (a) a,b,c order by a) order by
> b,c,a;

If I understand you correctly, I don't think I would expect either.

- ORDER BY provides a sort order for the result set

- DISTINCT ON specifies a list of expressions, and says: "For each
set of rows in the result set for which these expressions are all
equal, retain the first row and throw the rest away", where the
"first row" is defined by the ORDER BY sort order

So I'd expect this query to

(a) keep at most one of every distinct 'a' value. When throwing out
duplicates, we should keep the row that would come first as
specified by the ORDER BY sort order

(b) sort the result set by b,c,a

ISTM this interpretation is pretty logical, and that the current
restriction is made purely for the sake of ease-of-implementation. If
that's the case, we should at least document this restriction, and
perhaps plan on correcting it in the future.

-Neil

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dennis Bjorklund 2003-12-14 07:44:33 Function argument names
Previous Message Greg Stark 2003-12-14 03:24:23 Re: Walker/mutator prototype.