Window Function "Run Conditions"

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Window Function "Run Conditions"
Date: 2021-07-01 09:11:21
Message-ID: CAApHDvqvp3At8++yF8ij06sdcoo1S_b2YoaT9D4Nf+MObzsrLQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

It seems like a few too many years of an SQL standard without any
standardised way to LIMIT the number of records in a result set caused
various applications to adopt some strange ways to get this behaviour.
Over here in the PostgreSQL world, we just type LIMIT n; at the end of
our queries. I believe Oracle people did a few tricks with a special
column named "rownum". Another set of people needed SQL that would
work over multiple DBMSes and used something like:

SELECT * FROM (SELECT ... row_number() over (order by ...) rn) a WHERE rn <= 10;

I believe it's fairly common to do paging this way on commerce sites.

The problem with PostgreSQL here is that neither the planner nor
executor knows that once we get to row_number 11 that we may as well
stop. The number will never go back down in this partition.

I'd like to make this better for PostgreSQL 15. I've attached a WIP
patch to do so.

How this works is that I've added prosupport functions for each of
row_number(), rank() and dense_rank(). When doing qual pushdown, if
we happen to hit a windowing function, instead of rejecting the
pushdown, we see if there's a prosupport function and if there is, ask
it if this qual can be used to allow us to stop emitting tuples from
the Window node by making use of this qual. I've called these "run
conditions". Basically, keep running while this remains true. Stop
when it's not.

We can't always use the qual directly. For example, if someone does.

SELECT * FROM (SELECT ... row_number() over (order by ...) rn) a WHERE rn = 10;

then if we use the rn = 10 qual, we'd think we could stop right away.
Instead, I've made the prosupport function handle this by generating a
rn <= 10 qual so that we can stop once we get to 11. In this case we
cannot completely pushdown the qual. It needs to remain in place to
filter out rn values 1-9.

Row_number(), rank() and dense_rank() are all monotonically increasing
functions. But we're not limited to just those. COUNT(*) works too
providing the frame bounds guarantee that the function is either
monotonically increasing or decreasing.

COUNT(*) OVER (ORDER BY .. ROWS BETWEEN CURRENT ROW AND UNBOUNDED
FOLLOWING) is monotonically decreasing, whereas the standard bound
options would make it monotonically increasing.

The same could be done for MIN() and MAX(). I just don't think that's
worth doing. It seems unlikely that would get enough use.

Anyway. I'd like to work on this more during the PG15 cycle. I
believe the attached patch makes this work ok. There are just a few
things to iron out.

1) Unsure of the API to the prosupport function. I wonder if the
prosupport function should just be able to say if the function is
either monotonically increasing or decreasing or neither then have
core code build a qual. That would make the job of building new
functions easier, but massively reduce the flexibility of the feature.
I'm just not sure it needs to do more in the future.

2) Unsure if what I've got to make EXPLAIN show the run condition is
the right way to do it. Because I don't want nodeWindow.c to have to
re-evaluate the window function to determine of the run condition is
no longer met, I've coded the qual to reference the varno in the
window node's targetlist. That qual is no good for EXPLAIN so had to
include another set of quals that include the WindowFunc reference. I
saw that Index Only Scans have a similar means to make EXPLAIN work,
so I just followed that.

David

Attachment Content-Type Size
v1-0001-Allow-some-window-functions-to-finish-execution-e.patch application/octet-stream 49.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2021-07-01 09:17:22 Re: Window Function "Run Conditions"
Previous Message Kyotaro Horiguchi 2021-07-01 09:07:06 Re: ECPG bug fix: DECALRE STATEMENT and DEALLOCATE, DESCRIBE