Re: Window functions patch v04 for the September commit fest

From: "Hitoshi Harada" <umi(dot)tanuki(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Subject: Re: Window functions patch v04 for the September commit fest
Date: 2008-09-03 02:34:24
Message-ID: e08cc0400809021934r1bdb8c67w973964d569955800@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2008/9/2 Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>:
> Hitoshi Harada wrote:
>>
>> 2008/9/2 Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>:
>> In my understanding, the "Window Frame" is defined
>> by clauses such like "ROWS BETWEEN ... ", "RANGE BETWEEN ... " or so,
>> contrast to "Window Partition" defined by "PARTITION BY" clause. A
>> frame slides within a partition or there's only one frame if those
>> clauses are not specified. The current patch has not implemented this
>> yet. I'll update the docs.
>
> Yes, that's how I read it as well. Another way to think of it is that
> there's always a window frame, but if no ROWS BETWEEN or similar clause is
> given, the window frame spans the whole partition (or from the 1st row of
> the partition, up to the current row, if ORDER BY is given).
>
>> I don't like to call the second type "ranking aggregates" because it
>> may refer to only ranking functions though there are more types of
>> function like ntile(), lead() and lag(). But "window functions"
>> doesn't seem appropriate either since it's ambiguous with the general
>> name of window expressions.
>
> Yep, confusing :-(. The SQL standard says that "a window function is one of:
> a rank function, a distribution function, a row number function, a window
> aggregate function, the ntile function, the lead function, the lag function,
> the first-value function, the last-value function, the nth-value function".
> So, window aggregate functions are a special class of window functions, and
> there's no term to refer to all the rest of the window functions excluding
> window aggregate functions.
>
> Your doc SQL spec
> Window expression Window function
> Window function Any window function other than a window aggregate
> function
> Window aggregate Window aggregate function
>
> I tried to coin the term "ranking aggregate" for the SQL2008 term "Any
> window function other than a window aggregate function", but you're right
> that that's still confusing, because the SQL2008 term "rank function"
> includes only RANK() and DENSE_RANK().
>
> The spec calls them "group aggregate functions", when they're used with
> GROUP BY, rather than as a window function. I think we could use that term.

Agree. So from now on, we use "window functions" for all kinds of
functions including window aggregates. "Window expression" is
discarded. "Window functions" also means the mechanism to support
these functions to process and this project.

>> Your proposal is smarter than the current implementation. But it
>> doesn't seem complete in some way. From logical point of view, the
>> window functions should be allowed to access whole of rows in the
>> frame the current row belongs to (c.f. inverse distribution
>> functions).
>
> By the whole of rows, do you mean
> a) the chosen value or expression of all rows, or
> b) all columns of the input rows?

a). I mean all input rows in a window frame. But later I found
"inverse distribution function" is not one of window functions. That
is actually one of aggregate functions. Forget about it.

>
> Different window functions have different needs. RANK() for example does
> need to see all columns, to compare them, but it only needs access to the
> previous and current row. CUME_DIST on the other hand needs access to all
> columns of all rows, and LAG needs access to a specific column of a fixed
> number of rows. And row_number needs nothing.
>
> The needs of access to the rows are so different that it seems best to me to
> delegate the buffering to the window function.

Delegating optimization to them depending on functions' needs is a
good idea. So executor can concentrate on the window function process
flow. Let's unify it in executor and let trivial optimizations get
into individual functions.

>
> Actually, looking closer to the ranking functions, they don't really need
> access to all columns. They just need to be able to compare them, according
> to the window ordering, so we should probably only provide access to the
> arguments of the aggregate, evaluated for any row in the frame, and a
> comparator function, that can determine if any given rows in the frame are
> <, = or >.

That is kind of problem. If your task is only to define that window
node executor simply stores window frame rows and pass them to window
functions as they need, the rank functions' needs don't come. As you
point out, rank functions need ordering key columns and its
comparators. So you have to imagine what comes next? What will be
wanted other than ordering key columns, if we think about "universe
window functions" much more than SQL spec says?

>
>> From implementing and performance point of view, there
>> need to consider such case like mixing window aggregates and ranking
>> aggregates in the same window, which means it is better that the two
>> types of functions are processed in the similar flow. Also logically,
>> SQL spec doesn't forbid ranking aggregates in sliding window frames.
>
> It doesn't? Oh, bummer. It seems we need a grand unified theory of ranking
> and other aggregates then :-(. How does something like RANK work with a
> window frame? What does it return, if the current row is excluded from the
> window frame, e.g with EXCLUDE CURRENT ROW?

It doesn't but implicitly does. Also it seems that the Oracle and
other RDMBSs do. Rank functions in sliding frame are possible because
if you take the current row the window frame can be defined and with
it aggregate. Also in implementing, without having catalog info about
which function can take sliding window, parser/planner/executor cannot
see how to forbid it. Have the info, otherwise we have to allow it.

> Let's look at the trivial, generic, and slow implementation first, and then
> figure out what tricks we can do to make it faster. I gather that that
> generic algorithm, following how the SQL spec defines the window frame,
> looks like this:

So as to satisfy all of window functions' needs, Window object does
well. But it is so heavy that we need to optimize as functions need,
by reducing stored rows and avoiding to call redundant functions.
Still I'm afraid if user can define such a complex function...

> Here's sketches of some aggregate implementations using this API:
>
> RANK()
> ------
> enterfunc:
> if position of the new row is > current row, do nothing. Otherwise,
> decrement current rank.
> exitfunc:
> if position of the new row is > current row, do nothing. Otherwise,
> increment current rank.

I don't see why the row's positions affect rank. The rank depends on
comparing two rows ordering columns, doen't it?

> I'd suggest:
>
> 1. Implement Window node, with the capability to invoke an aggregate
> function, using the above API. Implement required parser/planner changes.
> Implement a few simple ranking aggregates using the API.
> 2. Implement glue to call normal aggregates through the new API
> 3. Implement the optimization to drop rows that are no longer needed
> (signal_cutoff can be a no-op until this phase)
> 4. Implement window framing (the frame can always be all rows in the
> partition, or all rows until the current row, until this phase)
> 5. Expose the new API to user-defined aggregates. It can be an internal API
> only used by built-in functions until this phase
>
> I believe you already have phase 1 in your patch, except for the API
> changes.
>

I am willing to challenge to implement the API above, after maintain
the current patch adding docs and tests. Since the API includes
changes much more like Aggregate syntax than current patch, I'm not
sure if I can finish it by next commit fest, which is said to be
"feature freeze". For safety, remain the current patch to review
excluding API and executor then if I fail to finish use it for next
release. Git helps it by cutting a branch, does it? How do you think?

Regards,

--
Hitoshi Harada

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Brendan Jurd 2008-09-03 02:57:28 pg_typeof() (was: Mysterious Bus Error with get_fn_expr_argtype())
Previous Message Tom Lane 2008-09-03 02:05:19 Re: [gsmith@gregsmith.com: Re: [patch] GUC source file and line number]