Re: Questionable result from lead(0) IGNORE NULLS

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: ojford(at)gmail(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Questionable result from lead(0) IGNORE NULLS
Date: 2025-10-08 04:59:01
Message-ID: 20251008.135901.1201616738011376109.ishii@postgresql.org
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Oliver,

>> Nothing appears for lead at all. So it was doing something but doesn't look
>> like it handles the lead(x, 0) case
>
> I think we need to change this:
>
> forward = relpos > 0 ? 1 : -1;
> :
> :
> /*
> * Get the next nonnull value in the partition, moving forward or backward
> * until we find a value or reach the partition's end.
> */
> do
> {
> int nn_info; /* NOT NULL info */
>
> abs_pos += forward;
> if (abs_pos < 0) /* apparently out of partition */
> break;
>
> In lead(0, x) case, abs_pos==0 and foward==-1. So it exits the loop
> due to out of partition. Probably we need to change
> forward = relpos > 0 ? 1 : -1;
> to
> forward = relpos >= 0 ? 1 : -1;
> and change the do..while loop to a for loop?

Attached patch is written in this direction. What do you think?
Below are the results. IMO now lead() returns correct results.

psql -a -f window.sql test
CREATE TEMP TABLE g(x INT, y INT);
CREATE TABLE
INSERT INTO g (VALUES(NULL,1),(NULL,2),(1,3));
INSERT 0 3
SELECT * FROM g;
x | y
---+---
| 1
| 2
1 | 3
(3 rows)

SELECT x, y, lead(x, 0) RESPECT NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
x | y | lead
---+---+------
| 1 |
| 2 |
1 | 3 | 1
(3 rows)

SELECT x, y, lead(x, 0) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
x | y | lead
---+---+------
| 1 | 1
| 2 | 1
1 | 3 | 1
(3 rows)

SELECT x, y, lead(x, 1) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
x | y | lead
---+---+------
| 1 |
| 2 |
1 | 3 |
(3 rows)

While working on this, I found some of window function regression
tests using lead/lag are not quite correct. Below is some of them.

-- lead
SELECT name,
orbit,
lead(orbit) OVER w AS lead,
lead(orbit) RESPECT NULLS OVER w AS lead_respect,
lead(orbit) IGNORE NULLS OVER w AS lead_ignore
FROM planets
WINDOW w AS (ORDER BY name)
;
name | orbit | lead | lead_respect | lead_ignore
---------+-------+-------+--------------+-------------
earth | | 4332 | 4332 | 4332
jupiter | 4332 | | | 88
mars | | 88 | 88 | 88
mercury | 88 | 60182 | 60182 | 60182
neptune | 60182 | 90560 | 90560 | 90560
pluto | 90560 | 24491 | 24491 | 24491
saturn | 24491 | | | 224
uranus | | 224 | 224 | 224
venus | 224 | | |
xyzzy | | | |
(10 rows)

Why lead_ignore shows "4332" on the first row? Since "orbit"'s second
non null row is orbit==88, I think lead(orbit) should return 88,
rather than 4332 if my understanding of the SQL standard is correct.

IMO the right result is as below, which is actually the one after
applying the patch.

name | orbit | lead | lead_respect | lead_ignore
---------+-------+-------+--------------+-------------
earth | | 4332 | 4332 | 88
jupiter | 4332 | | | 88
mars | | 88 | 88 | 60182
mercury | 88 | 60182 | 60182 | 60182
neptune | 60182 | 90560 | 90560 | 90560
pluto | 90560 | 24491 | 24491 | 24491
saturn | 24491 | | | 224
uranus | | 224 | 224 |
venus | 224 | | |
xyzzy | | | |
(10 rows)

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Attachment Content-Type Size
v1-0001-Fix-window-function-lead-lag-incorrect-results.patch application/octet-stream 2.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2025-10-08 05:02:03 Re: Add memory_limit_hits to pg_stat_replication_slots
Previous Message Bertrand Drouvot 2025-10-08 04:45:48 Re: Add stats_reset to pg_stat_all_tables|indexes and related views