From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | ojford(at)gmail(dot)com |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Questionable result from lead(0) IGNORE NULLS |
Date: | 2025-10-08 04:59:01 |
Message-ID: | 20251008.135901.1201616738011376109.ishii@postgresql.org |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Oliver,
>> Nothing appears for lead at all. So it was doing something but doesn't look
>> like it handles the lead(x, 0) case
>
> I think we need to change this:
>
> forward = relpos > 0 ? 1 : -1;
> :
> :
> /*
> * Get the next nonnull value in the partition, moving forward or backward
> * until we find a value or reach the partition's end.
> */
> do
> {
> int nn_info; /* NOT NULL info */
>
> abs_pos += forward;
> if (abs_pos < 0) /* apparently out of partition */
> break;
>
> In lead(0, x) case, abs_pos==0 and foward==-1. So it exits the loop
> due to out of partition. Probably we need to change
> forward = relpos > 0 ? 1 : -1;
> to
> forward = relpos >= 0 ? 1 : -1;
> and change the do..while loop to a for loop?
Attached patch is written in this direction. What do you think?
Below are the results. IMO now lead() returns correct results.
psql -a -f window.sql test
CREATE TEMP TABLE g(x INT, y INT);
CREATE TABLE
INSERT INTO g (VALUES(NULL,1),(NULL,2),(1,3));
INSERT 0 3
SELECT * FROM g;
x | y
---+---
| 1
| 2
1 | 3
(3 rows)
SELECT x, y, lead(x, 0) RESPECT NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
x | y | lead
---+---+------
| 1 |
| 2 |
1 | 3 | 1
(3 rows)
SELECT x, y, lead(x, 0) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
x | y | lead
---+---+------
| 1 | 1
| 2 | 1
1 | 3 | 1
(3 rows)
SELECT x, y, lead(x, 1) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
x | y | lead
---+---+------
| 1 |
| 2 |
1 | 3 |
(3 rows)
While working on this, I found some of window function regression
tests using lead/lag are not quite correct. Below is some of them.
-- lead
SELECT name,
orbit,
lead(orbit) OVER w AS lead,
lead(orbit) RESPECT NULLS OVER w AS lead_respect,
lead(orbit) IGNORE NULLS OVER w AS lead_ignore
FROM planets
WINDOW w AS (ORDER BY name)
;
name | orbit | lead | lead_respect | lead_ignore
---------+-------+-------+--------------+-------------
earth | | 4332 | 4332 | 4332
jupiter | 4332 | | | 88
mars | | 88 | 88 | 88
mercury | 88 | 60182 | 60182 | 60182
neptune | 60182 | 90560 | 90560 | 90560
pluto | 90560 | 24491 | 24491 | 24491
saturn | 24491 | | | 224
uranus | | 224 | 224 | 224
venus | 224 | | |
xyzzy | | | |
(10 rows)
Why lead_ignore shows "4332" on the first row? Since "orbit"'s second
non null row is orbit==88, I think lead(orbit) should return 88,
rather than 4332 if my understanding of the SQL standard is correct.
IMO the right result is as below, which is actually the one after
applying the patch.
name | orbit | lead | lead_respect | lead_ignore
---------+-------+-------+--------------+-------------
earth | | 4332 | 4332 | 88
jupiter | 4332 | | | 88
mars | | 88 | 88 | 60182
mercury | 88 | 60182 | 60182 | 60182
neptune | 60182 | 90560 | 90560 | 90560
pluto | 90560 | 24491 | 24491 | 24491
saturn | 24491 | | | 224
uranus | | 224 | 224 |
venus | 224 | | |
xyzzy | | | |
(10 rows)
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Fix-window-function-lead-lag-incorrect-results.patch | application/octet-stream | 2.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Bertrand Drouvot | 2025-10-08 05:02:03 | Re: Add memory_limit_hits to pg_stat_replication_slots |
Previous Message | Bertrand Drouvot | 2025-10-08 04:45:48 | Re: Add stats_reset to pg_stat_all_tables|indexes and related views |