Re: [HACKERS] DISTINCT ON: speak now or forever hold your peace

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Chris Bitmead <chris(at)bitmead(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org, pgsql-sql(at)postgreSQL(dot)org
Subject: Re: [HACKERS] DISTINCT ON: speak now or forever hold your peace
Date: 2000-01-25 03:45:49
Message-ID: 11547.948771949@sss.pgh.pa.us
Lists: pgsql-hackers, pgsql-sql

Chris Bitmead <chris(at)bitmead(dot)com> writes:
> Tom Lane wrote:
>> If I don't hear loud hollers very soon, I'm going to eliminate the
>> DISTINCT ON "feature" for 7.0.  As previously discussed, this feature
>> is not standard SQL and has no clear semantic interpretation.

> I don't feel overly strongly about this, but if I remember right you can
> do some pretty cool things with this feature, provided you do define
> some semantics clearly.

We did talk about that, but I didn't hear any strong support for doing
it, as opposed to pulling the feature completely... in particular,
I didn't hear anyone volunteering to do the work...

> as long as it's useful, how about clearly defining it? I don't know that
> there is an easy way of doing this in standard SQL. I don't see any
> problems with useful extensions to SQL.

The only reason it came to my notice in the first place was people
posting questions asking why they weren't getting the results they
expected from it (whatever the heck those were; they weren't what you
actually get from the current implementation, anyway).  The problem
with a poorly-specified nonstandard feature is support costs: you
have to document it, answer questions about it, keep it working, etc.
In this case we'd also have to define how it should work and alter
the existing code to produce reasonable and predictable results.  The
existing code is not merely unpredictable, it is definitely broken.
For example:

regression=# select q1,q2 from int8_tbl;
        q1        |        q2
------------------+-------------------
              123 |               456
              123 |  4567890123456789
 4567890123456789 |               123
 4567890123456789 |  4567890123456789
 4567890123456789 | -4567890123456789
(5 rows)

regression=# select distinct on q1 q1,q2 from int8_tbl;
        q1        | q2
------------------+-----
              123 | 456
 4567890123456789 | 123
(2 rows)

-- OK so far, but:

regression=# select distinct on q1 q1,q2 from int8_tbl order by q2;
        q1        |        q2
------------------+-------------------
 4567890123456789 | -4567890123456789
              123 |               456
 4567890123456789 |  4567890123456789
(3 rows)

-- which is not "distinct on q1" by my notions...
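
To make the complaint concrete, here is a minimal sketch of one rule that would give a predictable answer: sort by the DISTINCT ON column first, break ties with a second, explicitly chosen column, and keep only the first row of each group. This is an illustration under stated assumptions, not the existing implementation and not a rule proposed in this thread; the function name `distinct_on` and the choice of q2 as the tie-breaker are hypothetical.

```python
from typing import Callable, List, Tuple

Row = Tuple[int, int]  # (q1, q2)

# Same five rows as int8_tbl in the transcript above.
INT8_TBL: List[Row] = [
    (123, 456),
    (123, 4567890123456789),
    (4567890123456789, 123),
    (4567890123456789, 4567890123456789),
    (4567890123456789, -4567890123456789),
]


def distinct_on(
    rows: List[Row],
    key: Callable[[Row], int],
    tie_break: Callable[[Row], int],
) -> List[Row]:
    """Return one row per distinct key value (hypothetical semantics).

    Rows are sorted by (key, tie_break); the first row of each key
    group is kept, so the result never depends on physical row order.
    """
    ordered = sorted(rows, key=lambda r: (key(r), tie_break(r)))
    result: List[Row] = []
    for row in ordered:
        # A new group starts whenever the key differs from the last kept row.
        if not result or key(row) != key(result[-1]):
            result.append(row)
    return result


if __name__ == "__main__":
    for q1, q2 in distinct_on(INT8_TBL, key=lambda r: r[0], tie_break=lambda r: r[1]):
        print(q1, q2)
```

Under this rule the query can only ever return two rows for int8_tbl, one per q1 value, and which q2 accompanies each q1 is fixed by the tie-breaker rather than left to chance. The three-row result shown above could not occur.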


In short, it's not clear to me that supporting DISTINCT ON is a good use
of our limited resources.  I'm willing to pull it out, but not to fix it.
Does someone else want to take responsibility for it?

			regards, tom lane
