Re: Planning aggregates which require sorted or distinct

From: Gavin Sherry <swm(at)alcove(dot)com(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Planning aggregates which require sorted or distinct
Date: 2007-01-20 13:19:26
Message-ID: Pine.LNX.4.58.0701202355350.29254@linuxworld.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 20 Jan 2007, Tom Lane wrote:

> Gavin Sherry <swm(at)alcove(dot)com(dot)au> writes:
> > We want to answer the following: for each employee: what is their rank in
> > terms of salary and what is their rank in terms of age. This query
> > answers that:
>
> > select empno, rank() over (order by salary) as srank,
> > rank() over (order by age) as arank
> > from employees order by empno;
>
> Eeek. This seems like the worst sort of action-at-a-distance. How does
> rank() know what value it's supposed to report the rank of?

This is a frustratingly inconsistent bit of the spec. Rank is defined as
follows:

RANK() OVER WNS is equivalent to:
( COUNT (*) OVER (WNS1 RANGE UNBOUNDED PRECEDING)
- COUNT (*) OVER (WNS1 RANGE CURRENT ROW) + 1 )

Say the salary column has the following values: {100, 200, 200, 300}. This
would give the following output: {1, 2, 2, 4}. DENSE_RANK() would give:
{1, 2, 2, 3}.

These functions are pretty ugly (if you think about them in terms of our
existing aggregates). However, they are by far the most heavily used
window functions (along with ROW_NUMBER()).

Thanks,

Gavin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Gregory Stark 2007-01-20 13:36:32 Re: Planning aggregates which require sorted or distinct
Previous Message Gavin Sherry 2007-01-20 12:54:51 Re: Planning aggregates which require sorted or distinct