Re: Window Function "Run Conditions"

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Window Function "Run Conditions"
Date: 2021-07-01 09:17:22
Message-ID: CAFj8pRAxeJ=xC67rnpd-GQwzM9D-0KbQCuO5p-C0ZkbxZAHOew@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

čt 1. 7. 2021 v 11:11 odesílatel David Rowley <dgrowleyml(at)gmail(dot)com> napsal:

> It seems like a few too many years of an SQL standard without any
> standardised way to LIMIT the number of records in a result set caused
> various applications to adopt some strange ways to get this behaviour.
> Over here in the PostgreSQL world, we just type LIMIT n; at the end of
> our queries. I believe Oracle people did a few tricks with a special
> column named "rownum". Another set of people needed SQL that would
> work over multiple DBMSes and used something like:
>
> SELECT * FROM (SELECT ... row_number() over (order by ...) rn) a WHERE rn
> <= 10;
>
> I believe it's fairly common to do paging this way on commerce sites.
>
> The problem with PostgreSQL here is that neither the planner nor
> executor knows that once we get to row_number 11 that we may as well
> stop. The number will never go back down in this partition.
>
> I'd like to make this better for PostgreSQL 15. I've attached a WIP
> patch to do so.
>
> How this works is that I've added prosupport functions for each of
> row_number(), rank() and dense_rank(). When doing qual pushdown, if
> we happen to hit a windowing function, instead of rejecting the
> pushdown, we see if there's a prosupport function and if there is, ask
> it if this qual can be used to allow us to stop emitting tuples from
> the Window node by making use of this qual. I've called these "run
> conditions". Basically, keep running while this remains true. Stop
> when it's not.
>
> We can't always use the qual directly. For example, if someone does.
>
> SELECT * FROM (SELECT ... row_number() over (order by ...) rn) a WHERE rn
> = 10;
>
> then if we use the rn = 10 qual, we'd think we could stop right away.
> Instead, I've made the prosupport function handle this by generating a
> rn <= 10 qual so that we can stop once we get to 11. In this case we
> cannot completely pushdown the qual. It needs to remain in place to
> filter out rn values 1-9.
>
> Row_number(), rank() and dense_rank() are all monotonically increasing
> functions. But we're not limited to just those. COUNT(*) works too
> providing the frame bounds guarantee that the function is either
> monotonically increasing or decreasing.
>
> COUNT(*) OVER (ORDER BY .. ROWS BETWEEN CURRENT ROW AND UNBOUNDED
> FOLLOWING) is monotonically decreasing, whereas the standard bound
> options would make it monotonically increasing.
>
> The same could be done for MIN() and MAX(). I just don't think that's
> worth doing. It seems unlikely that would get enough use.
>
> Anyway. I'd like to work on this more during the PG15 cycle. I
> believe the attached patch makes this work ok. There are just a few
> things to iron out.
>
> 1) Unsure of the API to the prosupport function. I wonder if the
> prosupport function should just be able to say if the function is
> either monotonically increasing or decreasing or neither then have
> core code build a qual. That would make the job of building new
> functions easier, but massively reduce the flexibility of the feature.
> I'm just not sure it needs to do more in the future.
>
> 2) Unsure if what I've got to make EXPLAIN show the run condition is
> the right way to do it. Because I don't want nodeWindow.c to have to
> re-evaluate the window function to determine of the run condition is
> no longer met, I've coded the qual to reference the varno in the
> window node's targetlist. That qual is no good for EXPLAIN so had to
> include another set of quals that include the WindowFunc reference. I
> saw that Index Only Scans have a similar means to make EXPLAIN work,
> so I just followed that.
>

+1

this can be very nice feature

Pavel

>
> David
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2021-07-01 09:27:00 Re: Numeric multiplication overflow errors
Previous Message David Rowley 2021-07-01 09:11:21 Window Function "Run Conditions"