Re: Windowing Function Patch Review -> Standard Conformance

From: "Hitoshi Harada" <umi(dot)tanuki(at)gmail(dot)com>
To: "David Rowley" <dgrowley(at)gmail(dot)com>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Windowing Function Patch Review -> Standard Conformance
Date: 2008-12-07 14:37:50
Message-ID: e08cc0400812070637w1b88abc6g1ed71005c395dba4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2008/12/6 David Rowley <dgrowley(at)gmail(dot)com>:
> Hitoshi Harada Wrote:
>> 2008/12/3 Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>:
>> > I am randomly trying some issues instead of agg common code (which I
>> > now doubt if it's worth sharing the code), so tell me if you're
>> > restarting your hack again. I'll send the whole patch.
>> >
>>
>> Attached is the updated patch, including:
>>
>> - performance tuning up for large data sets
>> - move find_wfunc to optimizer/util/clauses.c
>> - rename WFunc to WindowFunc
>> - rename WinDef to WindowDef
>> - rename wfunc->pure_agg to winagg
>> - remove winstate->tail_ptr
>> - fix WinFrameGetArg in case that there are more than one peer, add
>> relevant test
>> - duplicate GetAggInitVal(), not sharing aggregate routines with nodeAgg.c
>>
>> I believe the remaining work is only to optimize row_number()/rank()
>> cases to trim tuplestore like window aggregates. This will be done by
>> providing some kind of way for each window functions to tell Window
>> node that it doesn't require backward_scan.

I've spent hours to try this issue, and concluded it doesn't pay.

First, the test is on this query:

explain analyze select id, row_number() OVER (order by id)
from bigtable order by id;

where "bigtable" has ~400MB, 10000000 rows.

This simple row_number() query takes about 50 sec, whereas without
row_number() indexscan query takes about 25 sec. I wondered what makes
the difference 25 sec.

With this test, the tuplestore dumps its tuples since it never trims.
Then I took profile of nodeWindow in some points,

tuplestore_puttupleslot 13.6 sec
spool_tuples 37.9 sec
eval_windowfunction 9.3 sec

Note that spool_tuples contains execProc(outerPlan), which means its
37.9 sec contains outer indexscan 25 sec, plus tuplestore_puttuple,
13.9 sec.

Then I modified some code to let tuplestore trim and tested again then
the results were:

tuplestore_puttupleslot 11.2 sec
spool_tuples 35.8 sec
eval_windowfunction 9.5 sec

It shows even though tuplestore trims its tuples and stays in memory
rather than dumps them on files, the performance up is only 2 sec in
50 sec. So I concluded the optimization for row_number()/rank() etc
doesn't pay for its more complexity in window function API. The
bottleneck of the Window node origins from something else, like
puttupleslot(), not Window node algorithm. Of course those issues
should be tracked more precisely, for the window functions work I
ignore them.

>
> It's been a long time since the CommitFest started. This patch has come a
> long way in that time. We've seen various performance improvements and many
> fixes where the patch was not matching the standard. We've also seen quite a
> big change in the window buffering technique which is showing amazing
> performance improvements in certain queries.
>
> I've spent many hours reading the standard and comparing the results from
> this patch against other RDBMS' that support window functions. I wonder if
> we're the first to implement NTH_VALUE()?.
>
> The patch, in my opinion is getting very close to being committable. The
> only outstanding issues; Please correct me where I'm wrong or have omitted
> something.
>
> * Fix still required for rank_up. Code still has a FIXME.
This was whether rank() without ORDER BY clause should be valid or
not. The answer is yes as it is implemented now, so I removed only
comments.

> * ROW_NUMBER() and RANK() performance to be looked into.
As I tested above.

> * Slight changes required in documentation (my previous email).
Applied to the patch.

> * Final code read by someone reviewing the code.
I am looking forward.

> I've also spent many hours trying to break this patch and in previous
> versions I've managed it. Last night I spent a few hours again testing this
> new patch and did not managed to find anything wrong. We're getting close to
>
> the time where the community can test further by committing this patch.
Agree. I'll send the latest patch and finish my work for now.

Regards,

--
Hitoshi Harada

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hitoshi Harada 2008-12-07 14:41:14 Re: Windowing Function Patch Review -> Standard Conformance
Previous Message Hannu Krosing 2008-12-07 12:04:08 Re: user-based query white list