Re: Windowing Function Patch Review -> Standard Conformance

From: "Hitoshi Harada" <umi(dot)tanuki(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: "David Rowley" <dgrowley(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Windowing Function Patch Review -> Standard Conformance
Date: 2008-11-26 12:29:42
Message-ID: e08cc0400811260429y74b78fbfv296deaaa8c3410a3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2008/11/26 Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>:
> Hitoshi Harada wrote:
>>
>> I read more, and your spooling approach seems flexible for both now
>> and the furture. Looking at only current release, the frame with ORDER
>> BY is done by detecting peers in WinFrameGetArg() and add row number
>> of peers to winobj->currentpos. Actually if we have capability to
>> spool all rows we need on demand, the frame would be only a boundary
>> problem.
>
> Yeah, we could do that. I'm afraid it would be pretty slow, though, if
> there's a lot of peers. That could probably be alleviated with some sort of
> caching, though.
>
>> It seems to me that eval_windowaggregate() also should use frame APIs.
>> Only things we have to care is the shrinking frame, which is not
>> supported in this release. So I'd suggest winobj->aggregatedupto to be
>> removed. Is there objection?
>
> Actually, I took a different approach in the latest patch. Window aggregates
> now use a separate read pointer into the tuplestore. The current row is also
> read using a separate read pointer in ExecWindow. The aggregate and current
> row read pointers don't need random access, which has the nice effect that
> if you only use window aggregates, not window functions, the tuplestore
> doesn't need random access, and doesn't need to spill to disk as long as the
> window frame fits in work_mem.
>
> We should still figure out how to make it possible to trim the tuplestore,
> when window functions that don't need random access are used. Like
> ROW_NUMBER and RANK. Earlier, I thought we should add function to the window
> object API to explicitly tell that tuples before row X are no longer needed.
> But I'm now starting to wonder if we should design the window object API
> more like the tuplestore API, with a read pointer that you can advance
> forward or backward, and rewind. That would probably map more naturally to
> the underlying tuplestore, and it seems like it would be just as easy to use
> in all the existing window functions.
>

Complete solution, at least for the current release. I now figure out
exactly what the tuplestore_trim does. So currentrow pointer doesn't
need go backward, neither does extending frame's aggregate pointer,
row_number, rank, etc. Cutting off frame's aggregate need random row,
so does lead, lag, etc. Even there were random access, it's very
flexible in triming and saving memory. Little concern is some
operations like WinRowIsPeer() doesn't know if the required row is
trimmed already, which isn't big problem in the existing functions.

Now you might think about sharing aggregate code like
initialize/advance/finalize. If you want I can refactor in nodeAgg.c
to be able sharing with nodeWindow.c, which wouldn't conflict with
your work.

Regards,

--
Hitoshi Harada

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavan Deolasee 2008-11-26 12:32:07 Re: Review: Hot standby
Previous Message Sam Mason 2008-11-26 12:12:31 Re: Proposal for better PQExpBuffer out-of-memory behavior