Re: [HACKERS] distinct + order by

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	t-ishii(at)sra(dot)co(dot)jp (Tatsuo Ishii), hackers(at)postgreSQL(dot)org
Subject:	Re: [HACKERS] distinct + order by
Date:	1998-11-08 17:06:59
Message-ID:	19294.910544819@sss.pgh.pa.us
Lists:	pgsql-hackers

I said:
> If we did want to make this example behave in a rational way, then
> probably the right implementation is something like
> * sort by i,j
> * distinct-filter on i only, being careful to keep first row
> in each set of duplicates
> * sort by j
> This would ensure that the final sort by j uses, for each distinct i,
> the lowest of the j-values associated with that i. This is a totally
> arbitrary decision, but at least it will give reproducible results.

Some closer probing with "explain verbose" shows that
"SELECT DISTINCT i FROM dtest ORDER BY j" is actually transformed
into this:

Unique on i,j (cost=1.10 size=0 width=0)
  -> Sort by i,j (cost=1.10 size=0 width=0)
       -> Seq Scan on dtest selecting i,j (cost=1.10 size=3 width=16)

This explains why you get the apparently duplicate i values --- they're
not duplicate when both i and j are considered.
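
A minimal sketch of that effect, written in Python rather than in the backend's own code. The rows in dtest are hypothetical (the table contents are not shown in this message), and unique_adjacent is an illustrative helper, not a PostgreSQL function. It mimics a Unique pass that compares whole (i, j) rows over sorted input, then projects i:

```python
# Hypothetical contents of dtest as (i, j) pairs; the real table is not
# shown in this message.
dtest = [(1, 30), (2, 10), (1, 20)]


def unique_adjacent(rows):
    """Drop a row only when it equals the row immediately before it.

    This is how a duplicate filter over already-sorted input behaves.
    """
    out = []
    for row in rows:
        if not out or out[-1] != row:
            out.append(row)
    return out


# "Sort by i,j" followed by "Unique on i,j", as in the plan above.
plan_rows = unique_adjacent(sorted(dtest))

# Only i is returned to the user, so i = 1 shows up twice.
result_i = [i for i, _ in plan_rows]
print(result_i)
```

With these rows, (1, 20) and (1, 30) differ in j, so neither is filtered out and the output contains i = 1 twice.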

It looks to me like someone tried to make the query tree builder deal
with this case in the way I suggest above, but didn't finish the job.
The "Unique" pass is being done on the wrong targets, and there's no
final sort.
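
For comparison, the three steps quoted at the top (sort by i,j; distinct-filter on i keeping the first row; sort by j) could be sketched as follows. This is an illustration in Python under the same hypothetical dtest rows, not the planner's code, and distinct_i_order_by_j is a name made up for this example:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical contents of dtest as (i, j) pairs.
dtest = [(1, 30), (2, 10), (1, 20)]


def distinct_i_order_by_j(rows):
    # Step 1: sort by i, then j, so the lowest j comes first within each i.
    by_ij = sorted(rows)
    # Step 2: distinct-filter on i only, keeping the first row of each group.
    firsts = [next(group) for _, group in groupby(by_ij, key=itemgetter(0))]
    # Step 3: final sort by j.
    firsts.sort(key=itemgetter(1))
    return [i for i, _ in firsts]


result = distinct_i_order_by_j(dtest)
print(result)
```

Each i now appears once, ordered by the lowest j associated with it, which is the arbitrary but reproducible rule described in the quoted text.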

regards, tom lane

Responses

Re: [HACKERS] distinct + order by at 1998-12-12 20:33:13 from Bruce Momjian
