From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | ojford(at)gmail(dot)com |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Questionable result from lead(0) IGNORE NULLS |
Date: | 2025-10-08 08:00:07 |
Message-ID: | 20251008.170007.1614484426305151178.ishii@postgresql.org |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Oliver,
After studying the standard more, it seems I was totally wrong.
In summary, current code is correct. Sorry for noise.
From the standard explaining lead():
> B) If OFFSET = 0 (zero), then the value of <window function> is the
> value of VE1 evaluated for the current row.
(here VE1 referes to the first argument of lead()).
So it seems we need to treat offset==0 case specially. i.e. eval VE1
and do not through away even if the result is NULL.
> I noticed a questionable result from "lead(0) IGNORE NULLS".
>
> CREATE TEMP TABLE g(x INT, y INT);
> CREATE TABLE
> INSERT INTO g (VALUES(NULL,1),(NULL,2),(1,3));
> INSERT 0 3
> SELECT * FROM g;
> x | y
> ---+---
> | 1
> | 2
> 1 | 3
> (3 rows)
>
> SELECT x, y, lead(x, 0) RESPECT NULLS OVER w FROM g
> WINDOW w AS (ORDER BY y);
> x | y | lead
> ---+---+------
> | 1 |
> | 2 |
> 1 | 3 | 1
> (3 rows)
>
> SELECT x, y, lead(x, 0) IGNORE NULLS OVER w FROM g
> WINDOW w AS (ORDER BY y);
> x | y | lead
> ---+---+------
> | 1 |
> | 2 |
> 1 | 3 | 1
> (3 rows)
>
> As you can see, "lead(x, 0) IGNORE NULLS" shows the same result as
> "lead(x, 0) RESPECT NULLS". IMO "lead(x, 0) IGNORE NULLS" should show
> something like:
No. The result above is perfectly correct.
> if (winobj->ignore_nulls == IGNORE_NULLS && relpos != 0)
Now I understand why "relpos != 0" part is necessary.
Also, my email below was wrong.
> While working on this, I found some of window function regression
> tests using lead/lag are not quite correct. Below is some of them.
>
> -- lead
> SELECT name,
> orbit,
> lead(orbit) OVER w AS lead,
> lead(orbit) RESPECT NULLS OVER w AS lead_respect,
> lead(orbit) IGNORE NULLS OVER w AS lead_ignore
> FROM planets
> WINDOW w AS (ORDER BY name)
> ;
> name | orbit | lead | lead_respect | lead_ignore
> ---------+-------+-------+--------------+-------------
> earth | | 4332 | 4332 | 4332
> jupiter | 4332 | | | 88
> mars | | 88 | 88 | 88
> mercury | 88 | 60182 | 60182 | 60182
> neptune | 60182 | 90560 | 90560 | 90560
> pluto | 90560 | 24491 | 24491 | 24491
> saturn | 24491 | | | 224
> uranus | | 224 | 224 | 224
> venus | 224 | | |
> xyzzy | | | |
> (10 rows)
>
> Why lead_ignore shows "4332" on the first row? Since "orbit"'s second
> non null row is orbit==88, I think lead(orbit) should return 88,
> rather than 4332 if my understanding of the SQL standard is correct.
According to the standard if OFFSET is not 0, then:
> B) Otherwise, let TX be the sequence of values that is the result of applying VE1 to each
> row of T that follows the current row and eliminating null values, ordered according
> to the window ordering of WDX.
> C) Otherwise, the value of <window function> is the m-th value of
> TX, where m = OFFSET.
Thus TX does not include current row and for the first row, and
lead(orbit) returns the second row, that is orbit==4332 because m==1.
So, current regression test expected file is correct.
Again, sorry for noise.
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
From | Date | Subject | |
---|---|---|---|
Next Message | v@viktorh.net | 2025-10-08 08:51:39 | Re: Allow ON CONFLICT DO UPDATE to return EXCLUDED values |
Previous Message | Jeremy Schneider | 2025-10-08 07:06:12 | Re: sync_standbys_defined and pg_stat_replication |