User defined window functions calling WinGetFuncArgInPartition/WINDOW_SEEK_HEAD

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: ojford(at)gmail(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: User defined window functions calling WinGetFuncArgInPartition/WINDOW_SEEK_HEAD
Date: 2025-10-09 06:08:58
Message-ID: 20251009.150858.1461274753395856075.ishii@postgresql.org
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Oliver,

PostGis has user defined window functions. I checked them and found
they call WinGetFuncArgInPartition with WINDOW_SEEK_HEAD.
https://gitea.osgeo.org/postgis/postgis/search?q=WinGetFuncArgInPartition&fuzzy=true

If they decide to call the window functions with IGNORE NULLS option,
I think they may be surprised that
WinGetFuncArgInPartition/WINDOW_SEEK_HEAD returns somewhat
counter intuitive results.

To study this I created a tiny user defined window function which
calls WinGetFuncArgInPartition with WINDOW_SEEK_HEAD (see attached).

CREATE FUNCTION mywindowfunc(IN anycompatible, IN int)
RETURNS anycompatible
AS 'MODULE_PATHNAME', 'mywindowfunc'
LANGUAGE C WINDOW STRICT;

The first argument is an any expression and the second argument is
passed to WinGetFuncArgInPartition's relpos argument. Here is the
result:

CREATE TEMP TABLE g(x INT, y INT);
CREATE TABLE
INSERT INTO g (VALUES(NULL,1),(NULL,2),(10,3),(20,4));
INSERT 0 4
SELECT * FROM g;
x | y
----+---
| 1
| 2
10 | 3
20 | 4
(4 rows)

SELECT x, y, mywindowfunc(x, 0) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
x | y | mywindowfunc
----+---+--------------
| 1 |
| 2 |
10 | 3 |
20 | 4 |
(4 rows)

SELECT x, y, mywindowfunc(x, 1) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
x | y | mywindowfunc
----+---+--------------
| 1 | 10
| 2 | 10
10 | 3 | 10
20 | 4 | 10
(4 rows)

SELECT x, y, mywindowfunc(x, 2) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
x | y | mywindowfunc
----+---+--------------
| 1 | 20
| 2 | 20
10 | 3 | 20
20 | 4 | 20
(4 rows)

For relpos==0, 1, 2 it returns NULL, 10, 20 on the first result row.

I can explain why for relpos==0, NULL is returned. If IGNORE NULLS and
relpos==0, it is treated as if RESPECT NULLS is specified. But this
may be counterintuitive result.

reslpos==1, 2 case also seems strange to me. I expected for relpos==1,
the second non null row (20) is returned, but it returns 10. For
relpos==2, I expected NULL since there's no third non null row. But
others may think differently.

Since there's no such an idea like WINDOW_SEEK_HEAD in the SQL
standard, I think we have to decide what to do with WINDOW_SEEK_HEAD
case. One idea is, we treat it as nth_value with "ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING" frame option.
With nth_value we get 10, 20 and NULL for offset 1, 2 and 3.

What do you think?

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Attachment Content-Type Size
mywindowfunc.tar.bz2 application/octet-stream 8.7 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Hayato Kuroda (Fujitsu) 2025-10-09 06:22:26 RE: Patch for migration of the pg_commit_ts directory
Previous Message Amit Kapila 2025-10-09 06:07:45 Re: Logical Replication of sequences