Windowing Function Patch Review -> Standard Conformance

From: "David Rowley" <dgrowley(at)gmail(dot)com>
To: "'Hitoshi Harada'" <umi(dot)tanuki(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Cc: <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Subject: Windowing Function Patch Review -> Standard Conformance
Date: 2008-11-05 00:25:18
Message-ID: D4DF765C92DD40BDA5CBBD0BAB37AF87@amd64
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> All,
>
> This is my first patch review for PostgreSQL. I did submit a patch last
> commit fest (Boyer-Moore) so I feel I should review one this commit fest.
> I'm quite new to PostgreSQL so please don't rely on me totally. I'll do my
> best. Heikki is also reviewing this patch which makes me feel better.
>
> My aim is to get the author has much feed back as quickly as possible. For
> this reason I'm going to be breaking down my reviews into the following
> topics.
>
> 1. Does patch apply cleanly?
>
> 2. Any compiler warnings?
>
> 3. Do the results follow the SQL standard?
>
> 4. Performance Comparison, does it perform better than alternate ways of
> doing things. Self joins, sub queries etc.
>
> 5. Performance, no comparison. How does it perform with larger tables?

This thread covers part of 3.

Quoted from SQL:2008
"If CUME_DIST is specified, then the relative rank of a row R is defined as
NP/NR, where NP is defined
to be the number of rows preceding or peer with R in the window ordering of
the window partition of R
and NR is defined to be the number of rows in the window partition of R."

So let me create a quick test case...

create table employees (
id INT primary key,
name varchar(30) not null,
department varchar(30) not null,
salary int not null,
check (salary >= 0)
);

insert into employees values(1,'Jeff','IT',10000);
insert into employees values(2,'Sam','IT',12000);

insert into employees values(3,'Richard','Manager',30000);
insert into employees values(4,'Ian','Manager',20000);

insert into employees values(5,'John','IT',60000);
insert into employees values(6,'Matthew','Director',60000);

My interpretation of the standard should make the last two columns in the
following query equal, and they are in the patch.

SELECT name,CAST(r AS FLOAT) / c, cd
FROM (SELECT name,
ROW_NUMBER() OVER(ORDER BY salary) as r,
COUNT(*) OVER() AS c,
CUME_DIST() OVER(ORDER BY salary) AS cd
FROM employees
) t;

Both Oracle and Sybase say otherwise. Have I (we both) misinterpreted the
standard?

name,cast(t.r as real)/t.c,cd
'Jeff',0.16666666666666666,0.16666666666666666
'Sam',0.3333333333333333,0.3333333333333333
'Ian',0.5,0.5
'Richard',0.6666666666666666,0.6666666666666666
'John',0.8333333333333334,1.0
'Matthew',1.0,1.0

Above are the results from Sybase.

Can anyone see who is correct here? Is it possible that both Oracle and
Sybase have it wrong?

David.

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Vladimir Sitnikov 2008-11-05 01:02:57 Re: Windowing Function Patch Review -> Standard Conformance
Previous Message Guillaume Lelarge 2008-11-05 00:10:22 Re: Patch for ALTER DATABASE WITH TABLESPACE