From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> |
Cc: | Surafel Temesgen <surafel3000(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: FETCH FIRST clause WITH TIES option |
Date: | 2018-10-30 01:12:16 |
Message-ID: | 73660663-d1c6-12cc-200f-315616560a93@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 10/29/2018 05:47 PM, Andrew Gierth wrote:
>>>>>> "Tomas" == Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
>
> >> I still think that this is the wrong approach. Implementing WITH
> >> TIES and PERCENT together using an implicit window function call
> >> kills two birds with one very small stone (the only executor change
> >> needed would be teaching LIMIT to be able to stop on a boolean
> >> condition), with maximum reuse of existing facilities.
>
> Tomas> Hmmm, maybe. How would that work, exactly? Wouldn't that mean
> Tomas> extra overhead (the window functions are hardly free) and
> Tomas> limitations?
>
> They're not free, but then this case probably shouldn't be treated as a
> particularly hot code path.
>
> The basic idea is this: extend the Limit node to be able to stop on a
> boolean expression, such that the first false value ends the result.
>
> Then FETCH FIRST N WITH TIES becomes "stop when the expression
> rank() over (order by ...) <= N
> is no longer true" (where the ... is the existing top level order by)
>
> and FETCH FIRST N PERCENT is the same but with percent_rank (or
> cume_dist, I'd have to check the exact semantics)
>
> rank() doesn't need to read ahead in the input. percent_rank of course
> does, but it's clearly impossible to implement PERCENT without doing
> that.
>
OK. I was under the impression the window function would need to see all
the input rows, but that seems not to be the case in general.
> Also, one could consider extending LIMIT to support arbitrary
> expressions, with some syntax like LIMIT WHEN (condition) which would be
> the general form.
>
Hmmm. I can't really recall needing such thing, but of course - that
does not prove it'd not be a good thing.
> Tomas> FWIW, I doubt the patch can be much smaller/simpler - a
> Tomas> significant part of the new stuff is in gram.y and node read/out
> Tomas> infrastructure, the changes to LIMIT node are fairly minimal.
>
> It's not a matter of patch size as such, but reuse of mechanisms rather
> than adding new ones. Also doing WITH TIES as a special case in the
> Limit node doesn't make adding PERCENT any easier at all, whereas the
> above gets it basically for free.
>
Makes sense, I guess. At first I was thinking that this certainly does
not add more new mechanisms than allowing LIMIT to terminate on boolean
expression. But you're right about the WITH PERCENT part - using window
functions would make adding this much simpler.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2018-10-30 01:33:18 | Re: partition tree inspection functions |
Previous Message | Amit Langote | 2018-10-30 00:53:54 | Re: Speeding up INSERTs and UPDATEs to partitioned tables |