Re: distinct

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
Cc: "Otwell, Andrew (ISSAtlanta)" <AOtwell(at)iss(dot)net>, "'pgsql-general(at)postgresql(dot)org'" <pgsql-general(at)postgresql(dot)org>
Subject: Re: distinct
Date: 2000-10-10 02:58:25
Message-ID: 17265.971146705@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com> writes:
> There really isn't a spec way to do it, however,
> select distinct on (col1) col1, col2, col3, col4 from tablename;
> would get you one row for each distinct col1, but it's undefined what
> row that gets you.

You can get a well-defined result with DISTINCT ON if you sort by
*all* the columns, because we specify that DISTINCT ON yields the
first row of each group that are the same in the DISTINCT-ON'd columns.
So for example,

select distinct on (col1, col2) col1, col2, col3, col4 from tablename
order by col1, col2, col3, col4 desc;

would give you the minimum col3, and then the maximum col4 for that col3
value, within the rows having any particular col1 & col2 values.

This is a big kluge, no question, but there are cases where it gives
you just what you want and there's no good way to get the same result
in standard SQL. For example, the above query is NOT equivalent to

select col1, col2, min(col3), max(col4) from tablename
group by col1, col2;

That'll give you the same col1, col2, and col3 results --- but the col4
maxima will be across the whole of each col1/col2 group, and will not
necessarily come from the same rows the col3 values come from.

There was a nice example of an application where this was Just The Right
Thing in the discussions that led up to DISTINCT ON being cleaned up,
rather than axed, for 7.0. It was either pgsql-general or pgsql-sql,
I forget, but check the archives for discussion of DISTINCT ON between
6.5 and 7.0 release dates.

regards, tom lane

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Keith L. Musser 2000-10-10 03:14:48 Re: Re: JDBC Performance
Previous Message Fabrice Scemama 2000-10-10 02:57:49 Re: ODBC driver under the Windows 2000