Re: Support for RANGE ... PRECEDING windows in OVER

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: ian link <ian(at)ilink(dot)io>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-06-22 03:41:19
Message-ID: 51C51CDF.4060908@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 06/22/2013 03:30 AM, ian link wrote:
> Forgive my ignorance, but I don't entirely understand the problem. What
> does '+' and '-' refer to exactly?

Consider "RANGE 4.5 PRECEDING'.

You need to be able to test whether, for the current row 'b', any given
row 'a' is within the range (b - 4.5) < a <= b . Not 100% sure about the
< vs <= boundaries, but that's irrelevant for the example.

To test that, you have to be able to do two things: you have to be able
to test whether one value is greater than another, and you have to be
able to add or subtract a constant from one of the values.

Right now, the b-tree access method provides information on the ordering
operators < <= = > >= <> , which provides half the answer. But these
don't give any concept of *distance* - you can test ordinality but not
cardinality.

To implement the "different by 4.5" part, you have to be able to add 4.5
to one value or subtract it from the other.

The obvious way to do that is to look up the function that implements
the '+' or '-' operator, and do:

((OPERATOR(+))(a, 4.5)) > b AND (a <= b)

or

((OPERATOR(-))(b, 4.5)) < a AND (a <= b);

The problem outlined by Tom in prior discussion about this is that
PostgreSQL tries really hard not to assume that particular operator
names mean particular things. Rather than "knowing" that "+" is always
"an operator that adds two values together; is transitive, symmetric and
reflexive", PostgreSQL requires that you define an *operator class* that
names the operator that has those properties.

Or at least, it does for less-than, less-than-or-equals, equals,
greater-than-or-equals, greater-than, and not-equals as part of the
b-tree operator class, which *usually* defines these operators as < <= =
>= > <>, but you could use any operator names you wanted if you really
liked.

Right now (as far as I know) there's no operator class that lets you
identify operators for addition and subtraction in a similar way. So
it's necessary to either add such an operator class (in which case
support has to be added for it for every type), extend the existing
b-tree operator class to provide the info, or blindly assume that "+"
and "-" are always addition and subtraction.

For an example of why such assumptions are a bad idea, consider matrix
multiplication. Normally, "a * b" = "b * a", but this isn't true for
multiplication of matrices. Similarly, if someone defined a "+" operator
as an alias for string concatenation (||), we'd be totally wrong to
assume we could use that for doing range-offset windowing.

So. Yeah. Operator classes required, unless we're going to change the
rules and make certain operator names "special" in PostgreSQL, so that
if you implement them they *must* have certain properties. This seems
like a pretty poor reason to add such a big change.

I hope this explanation (a) is actually correct and (b) is helpful.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2013-06-22 03:50:52 Re: Support for REINDEX CONCURRENTLY
Previous Message Tom Lane 2013-06-22 03:19:27 Re: Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)