Frames vs partitions: is SQL2008 completely insane?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Subject: Frames vs partitions: is SQL2008 completely insane?
Date: 2008-12-26 21:04:34
Message-ID: 15325.1230325474@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

According to SQL2008 section 7.11 <window clause>, general rule 5, the
default definition of window framing in a window that has an ordering
clause but no framing (RANGE/ROWS) clause is that the window frame for
a given row R runs from the first row of its partition through the last
peer of R.

Section 6.10's general rules define the results of LEAD, LAG,
FIRST_VALUE, LAST_VALUE, NTH_VALUE in terms of the rows available in the
window frame of the current window, not its partition.

Meanwhile, section 6.10 <window function> syntax rule 6 says that
LEAD/LAG must use a window that has an ordering clause and no
framing clause.

This means that without an explicit framing clause, none of these
functions can "look beyond" the last peer of the current row; and
what's worse, LEAD/LAG seem to be explicitly forbidden from looking
further than that even if we had an implementation of framing clauses.

This seems to be less than sane. I would certainly expect that LEAD(x)
gives you the next value of x regardless of peer-row status, since
LAG(x) gives you the prior value of x regardless of peer row status.
It is also simply bizarre for FIRST_VALUE to give you the partition's
first row when LAST_VALUE doesn't give you the partition's last row.

Are there any errata for SQL2008 yet? Can anyone check the actual
behavior of DB2 or other DBMS's that claim to implement these functions?

I notice that the current patch code seems to implement
first/last/nth_value using the frame, but lead/lag using the partition,
which doesn't conform to spec AFAICS ... but lead/lag on the frame
doesn't actually appear to be a useful definition so I'd rather go
with that than with what the letter of the spec seems to say.

Lastly, for a simple aggregate used with an OVER clause, the current
patch seems to define the aggregate as being taken over the frame
rather than the partition, but I cannot find anything in SQL2008 that
lends any support to *either* definition.

Comments? This all seems rather badly broken.

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2008-12-26 21:30:20 Re: Window-functions patch handling of aggregates
Previous Message Robert Haas 2008-12-26 20:57:59 Re: Window-functions patch handling of aggregates