Re: Windowing Function Patch Review -> Standard Conformance

From: "Hitoshi Harada" <umi(dot)tanuki(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: "David Rowley" <dgrowley(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Windowing Function Patch Review -> Standard Conformance
Date: 2008-11-24 13:41:43
Message-ID: e08cc0400811240541p296f051v9f3298b821e23e0@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2008/11/24 Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>:
> Hitoshi Harada wrote:
>>
>> 2008/11/20 David Rowley <dgrowley(at)gmail(dot)com>:
>>>
>>> I won't be around until Monday evening (Tuesday morning JST). I'll pickup
>>> the review again there. It's really time for me to push on with this
>>> review.
>>> The patch is getting closer to getting the thumbs up from me. I'm really
>>> hoping that can happen on Monday. Then it's over to Heikki for more code
>>> feedback.
>>
>> This time I only fixed some bugs and rebased against the HEAD, but did
>> not refactored. I can figure out some part of what Heikki claimed but
>> not all, so I'd like to see what he will send and post another patch
>> if needed.
>
> Thanks! Here's what I've got this far I merged your latest patch into this,
> as well as latest changes from CVS HEAD.
>
> It's a bit of a mess, but here's the big changes I've done:
> - Removed all the iterator stuff. You can just call WinGetPart/FrameGetArg
> repeatedly. It ought to be just as fast if done right in nodeWindow.c.
> - Removed all the Shrinked/Extended stuff, as it's not needed until we have
> full support for window frames.
> - Removed all the WinRow* functions, you can just call WinFrame/PartGet*
> functions, using WINDOW_SEEK_CURRENT
> - Removed WinFrame/PartGetTuple functions. They were only used for
> determining if two tuples are peer with each other, so I added a
> WinRowIsPeer function instead.
> - Removed all the buffering strategy stuff. Currently, the whole partition
> is always materialized. That's of course not optimal; I'm still thinking we
> should just read the tuples from the outer node lazily, on-demand, instead.
> To avoid keeping all tuples in the partition in tuplestore, when not needed,
> should add an extra function to trim the tuplestore.
>
> There's now a lot less code in nodeWindow.c, and I'm starting to understand
> most of what-s left :-).
>
> TODO
> - clean up the comments and other mess.
> - Modify WinPart/FrameGetArg so that it's passed the parameter number
> instead of an Expr node.
>
> I'll continue working on the executor, but please let me know what you
> think.
>

It is amazing changes and I like it. Actually I wanted to get it
slimer but hadn't found the way.

two points, frame concept and window_gettupleslot

If you keep this in your mind, please don't be annoyed but the current
frame concept is wrong.

-- yours
sample=# select depname, empno, salary, last_value(empno) over(order
by salary) from empsalary;
depname | empno | salary | last_value
-----------+-------+--------+------------
personnel | 5 | 3500 | 5
personnel | 2 | 3900 | 2
develop | 7 | 4200 | 7
develop | 9 | 4500 | 9
sales | 4 | 4800 | 4
sales | 3 | 4800 | 3
sales |b1 | 5000 | 1
develop | 11 | 5200 | 11
develop | 10 | 5200 | 10
develop | 8 | 6000 | 8
(10 rows)

-- mine
sample=# select depname, empno, salary, last_value(empno) over(order
by salary) from empsalary;
depname | empno | salary | last_value
-----------+-------+--------+------------
personnel | 5 | 3500 | 5
personnel | 2 | 3900 | 2
develop | 7 | 4200 | 7
develop | 9 | 4500 | 9
sales | 4 | 4800 | 3
sales | 3 | 4800 | 3
sales | 1 | 5000 | 1
develop | 11 | 5200 | 10
develop | 10 | 5200 | 10
develop | 8 | 6000 | 8
(10 rows)

Note that on empno=4 then last_value=4(yours)/3(mine), which means the
frame is applied to last_value() as well as nth_value() and
first_value(). Not only to aggregates. So these lines in nodeWindow.c,

/*
* If there is an ORDER BY (we don't support other window frame
* specifications yet), the frame runs from first row of the partition
* to the current row. Otherwise the frame is the whole partition.
*/
if (((Window *) winobj->winstate->ss.ps.plan)->ordNumCols == 0)
max_pos = winobj->partition_rows - 1;
else
max_pos = winobj->currentpos;

max_pos is bogus. RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
means max_pos would be currentpos + rows_peers. I looked over the
patch and aggregates care it with winobj->aggregatedupto but what
about other window functions?

Then window_gettupleslot looks like killing performance in some cases.
What if the partition has millions of rows and wfunc1 seeks the head
and wfunc2 seeks the tail?

So as you point it is possible to define frame and partition while
scanning current rows rather than before scanning, I felt it'd cost
more. But I know this is work in progress, so you probably think about
the solutions. What do you think?

Regards,

--
Hitoshi Harada

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2008-11-24 13:43:09 Re: pgsql: Add support for matching wildcard server certificates to the new
Previous Message Zdenek Kotala 2008-11-24 13:40:46 Re: Minor race-condition problem during database startup