From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Neil Conway <neilc(at)samurai(dot)com> |
Cc: | Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: ORDER BY and DISTINCT ON |
Date: | 2003-12-15 03:00:14 |
Message-ID: | 5280.1071457214@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Neil Conway <neilc(at)samurai(dot)com> writes:
> Does the non-determinism you're referring to result from an ORDER BY
> on a non-deterministic expression, or the non-determinism that results
> from picking an effectively random row because the ORDER BY isn't
> sufficient?
The latter --- you don't know which row you'll get, because it depends
on what the sorting procedure does with equal keys. (I think. This
argument was a few years ago and I've not bothered to review the
archives.) With ordinary DISTINCT this does not matter because you
can't tell the difference between "equal" rows anyway --- but with
DISTINCT ON, you can tell the difference.
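
A minimal sketch of the point above, in Python rather than SQL. It models DISTINCT ON as "sort, then keep the first row of each group of equal keys". The helper `distinct_on` and the sample rows are hypothetical illustrations, not PostgreSQL code, and Python's stable sort stands in for whatever the real sort does with equal keys.

```python
from itertools import groupby
from operator import itemgetter

def distinct_on(rows, key, order_by):
    """Keep the first row of each `key` group after sorting by `order_by`.

    Hypothetical helper: a model of sort-then-unique, not PostgreSQL code.
    """
    ordered = sorted(rows, key=order_by)  # stable: ties keep their input order
    return [next(group) for _, group in groupby(ordered, key=key)]

# Same set of rows, arriving in two different physical orders.
rows_a = [("x", 1, "first"), ("x", 1, "second"), ("y", 2, "only")]
rows_b = [("x", 1, "second"), ("x", 1, "first"), ("y", 2, "only")]

by_key = itemgetter(0)  # sort key and DISTINCT ON key are both column 0

picked_a = distinct_on(rows_a, key=by_key, order_by=by_key)
picked_b = distinct_on(rows_b, key=by_key, order_by=by_key)

# Ordinary DISTINCT compares whole rows, so input order cannot show through.
plain_a = sorted(set(rows_a))
plain_b = sorted(set(rows_b))
```

Here `picked_a` and `picked_b` differ in which "x" row survives, because the sort key does not distinguish the two candidates, while `plain_a` and `plain_b` are identical.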
> Which seems like an unconvincing justification for rejecting the
> query: we accept DISTINCT ON with no ORDER BY clause at all, for
> example.
Well, we invent an ORDER BY clause matching the DISTINCT ON in that
case. The rationale for doing so is weak, I agree, but since you have
not specified a sort order, you can hardly argue that the result is
wrong.
I think you are correct that this restriction is essentially an
efficiency hack. But DISTINCT ON is in itself an efficiency hack.
I'm not sure I see the point of allowing a less-efficient variation
of the efficiency hack, which is what we'd have if we supported
DISTINCT ON with a non-matching ORDER BY. Certainly it doesn't seem
important enough to expend significant implementation effort on.
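
A rough sketch of the efficiency argument, in Python, under the assumption that a non-matching ORDER BY would mean "pick one row per DISTINCT ON key, then order the survivors by something else". The function names and data are hypothetical; this is a model of the sort-count difference, not how the planner is implemented.

```python
from itertools import groupby

def first_per_group(sorted_rows, key):
    """Unique step: keep the first row of each run of equal keys."""
    return [next(group) for _, group in groupby(sorted_rows, key=key)]

def matching(rows, distinct_key, tiebreak):
    # ORDER BY begins with the DISTINCT ON key: one sort serves both purposes.
    ordered = sorted(rows, key=lambda r: (distinct_key(r), tiebreak(r)))
    return first_per_group(ordered, distinct_key)

def non_matching(rows, distinct_key, order_key):
    # ORDER BY unrelated to the DISTINCT ON key (assumed semantics):
    # sort once to group, then sort the survivors again for output.
    grouped = sorted(rows, key=distinct_key)
    picked = first_per_group(grouped, distinct_key)
    return sorted(picked, key=order_key)

rows = [("b", 3), ("a", 2), ("a", 1), ("b", 0)]
one_sort = matching(rows, lambda r: r[0], lambda r: r[1])
two_sorts = non_matching(rows, lambda r: r[0], lambda r: r[1])
```

In the matching case the single sort also determines which row of each group is kept. In the non-matching case a second sort is needed, and the row kept per group again depends on how the first sort treated equal keys.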
regards, tom lane