Re: [SQL] SELECT DISTINCT and ORDER BY

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joerg Eriskat <Eriskat(at)icf(dot)med(dot)uni-muenchen(dot)de>
Cc: pgsql-sql(at)postgreSQL(dot)org
Subject: Re: [SQL] SELECT DISTINCT and ORDER BY
Date: 1999-09-15 14:26:37
Message-ID: 19160.937405597@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Joerg Eriskat <Eriskat(at)icf(dot)med(dot)uni-muenchen(dot)de> writes:
> I'm not sure it is a bug, so I'm asking the SQL gurus. I think
> there is a inconsistency with SELECT DISTINCT queries when used in
> combination with ORDER BY and column alias names (sorry for the rather
> long example):

Looks like a bug to me. This seems to be fixed in the current sources,
probably as a result of the changes I made to the representation of sort
ordering. What is happening in 6.5.* is that the system is failing to
notice that the order-by values are the same as the ones being called
out as DISTINCT, so it generates additional columns and then sorts on
those instead of the DISTINCT ones. 6.6-to-be isn't fooled by the
aliases.

The underlying problem is still there, though: ordering by a column
not mentioned in the DISTINCT list doesn't do the right thing.
Arguably there is no right thing, and the code should raise an error,
but as of now it definitely isn't doing anything reasonable. It would
take two sort steps to handle this: sort by the DISTINCT columns,
filter out adjacent duplicates, then sort by the ORDER BY columns.
But we only do one sort, and we use the ORDER BY columns (plus any
DISTINCT columns not already in ORDER BY).

However, there's more to this than meets the eye. If you examine any
non-DISTINCT column after the filter step, then you are going to get a
randomly chosen one of the values associated with that set of DISTINCT
values. So a second sort step on such a column after the DISTINCT
sort/filter would not give well-defined results. That's why I think
maybe we should raise an error. (I don't like "DISTINCT ON column"
either, since it is *necessarily* ill-defined.)

regards, tom lane

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 1999-09-15 14:34:57
Previous Message Tom Lane 1999-09-15 14:00:02 Re: [HACKERS] Permission problem with COPY FROM