Re: Add RANGE with values and exclusions clauses to the Window Functions

From: Pantelis Theodosiou <ypercube(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Oliver Ford <ojford(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add RANGE with values and exclusions clauses to the Window Functions
Date: 2018-02-07 08:05:15
Message-ID: CAE3TBxx1tZGrsrCKb3XvixvgLLvfvP-3vrXLu8TZb2woi5Rojg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Feb 4, 2018 at 6:10 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Oliver Ford <ojford(at)gmail(dot)com> writes:
> > [ 0001-window-frame-v13.patch ]
>
> I've been hacking on this all week (with breaks for release notes) and
> have gotten it into a state that I think is close to committable.
>
> There was quite a lot I didn't like about the patch initially, notably
> that the interaction with operator classes/families was done all wrong.
> The idea is to add one support function per opclass, not jam them all
> into one opclass that breaks every rule for B-tree opclasses. For one
> reason, with this approach there's no chance of dealing with non-default
> sort orders imposed by non-default opclasses. (As a concrete example,
> suppose that we have two btree opclasses for complex numbers, one that
> sorts by real part and one that sorts by imaginary part. You can write
> a well-defined in_range function for each of these, but one of them has
> to increment the real part and the other the imaginary part.) I whacked
> that around and also wrote the missing documentation for the API spec
> for in_range functions. The path of least resistance was to dump it
> into the nbtree/README text file, which I'm not that satisfied with;
> probably it should go in the main SGML docs, but I did not find a good
> place to put it.
>
> I also really didn't like the implementation you'd chosen in
> nodeWindowAgg.c to scan the entire partition and build an array of peer
> group lengths. That risks running OOM with a large partition. Even if
> the array doesn't lead to OOM, the tuplestore will spill to disk with
> nasty performance consequences. We should try to ensure that the
> tuplestore needn't get larger than the frame, so that well-written queries
> with narrow frames can execute without spilling to disk. So I rewrote
> that using an idea that had been speculated about in the original
> comments, but nobody had gotten to yet: add some more read pointers to
> track the frame boundaries, and advance them as needed. I'm not really
> sure if this ends up as more or few row comparisons than the other way,
> but in any case it uses a fixed amount of memory, which is good.
>
> Also, the last patch's reimplementation of WinGetFuncArgInFrame isn't
> right: AFAICS, it results in any "relpos" that would point to a row
> in the exclusion range returning the row just after/before that range,
> which is already wrong if the exclusion range is more than one row,
> plus it doesn't renumber the rows beyond the exclusion. The behavior
> we want is that the frame rows surviving after exclusion should appear
> consecutively numbered. (This could be exposed with some tests using
> nth_value.) I think the attached rewrite gets this right. Also, punting
> entirely on the set-mark problem for SEEK_TAIL cases doesn't satisfy me,
> for the same reason as above that we don't want the tuplestore to bloat.
> What I did below is to set the mark at the frame start, which at least
> gives an opportunity for efficient queries.
>
> I hacked around on various other things too, for instance the behavior
> for null values in RANGE mode didn't seem to be per spec.
>
> I'm reasonably happy with all the code now, though surely it could use
> another look by someone else. I've not yet reviewed the docs (other than
> the implementor-oriented details I added), nor have I really looked at the
> test cases. I do have a couple suggestions on the test cases: for one,
> rather than duplicating the same window definition N times in each query,
> use one WINDOW clause and reference it with "OVER windowname". Also,
> adding a bunch of columns of different types to a single table seems like
> a messy and not easily extensible way of testing different data types.
> I'd suggest leaving the existing table alone and adding a new test table
> per additional data type you want to test, so that there's an easy
> template for testing future additions of more in_range support.
>
> BTW, something I've not done here but am strongly tempted to do is
> run around and change all the uses of "RANGE value PRECEDING/FOLLOWING"
> terminology to, say, "RANGE offset PRECEDING/FOLLOWING". "value" is
> just way too generic a term for this situation, making documentation
> confusing, plus you end up contorting sentences to avoid constructions
> like "value of the value". I'm not wedded to "offset" if somebody's got a
> better word, but let's try to pick something more specific than "value".
> (In the ROWS and GROUPS cases, maybe write "count"? Not entirely sure
> what to do for text that's trying to address all three cases, though.)
>
>
What about "extent_size" or just "size"? I see the SQL spec refers to
"preceding or following size" in an error message: ("data exception —
invalid preceding or following size in window function" )

Best regards
Pantelis Theodosiou

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message amul sul 2018-02-07 08:23:00 In logical replication concurrent update of partition key creates a duplicate record on standby.
Previous Message Amit Khandekar 2018-02-07 07:11:25 Re: Query running for very long time (server hanged) with parallel append