Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, pramsey(at)cleverelephant(dot)ca, ojford(at)gmail(dot)com, peter(at)eisentraut(dot)org, li(dot)evan(dot)chao(at)gmail(dot)com, krasiyan(at)gmail(dot)com, vik(at)postgresfriends(dot)org, andrew(at)tao11(dot)riddles(dot)org(dot)uk, david(at)fetter(dot)org
Subject: Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options
Date: 2025-10-19 09:53:23
Message-ID: 20251019.185323.1616138750375579950.ishii@postgresql.org
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>>> 2. AFAICS there is only one notnull_info array, which amounts to
>>> assuming that the window function will have only one argument position
>>> that it calls WinGetFuncArgInFrame or WinGetFuncArgInPartition for.
>>> That may be true for the built-in functions but it seems mighty
>>> restrictive for extensions. Worse yet, there's no check, so that
>>> you'd just get silently wrong answers if two or more arguments are
>>> evaluated. I think there ought to be a separate array for each argno;
>>> of course only created if the window function actually asks for
>>> evaluations of a particular argno.
>>
>> I missed that. Thank you for pointed it out. I agree it would be
>> better allow to use multiple argument positions that calls
>> WinGetFuncArgInFrame or WinGetFuncArgInPartition in
>> extensions. Attached is a PoC patch for that.
>>
>> Currently there's an issue with the patch, however.
>>
>> SELECT x, y, mywindowfunc2(x, y, 2) IGNORE NULLS OVER w FROM g
>> WINDOW w AS (ORDER BY y);
>> psql:test2.sql:9: ERROR: cannot fetch row before WindowObject's mark position
>>
>> mywindowfunc2 is a user defined window function, taking 3 arguments. x
>> and y are expected to be evaluated to integer. The third argument is
>> relative offset to current row. In the query above x and y are
>> retrieved using two WinGetFuncArgInPartition() calls. The data set
>> (table "g") looks like below.
>>
>> x | y
>> ----+---
>> | 1
>> | 2
>> 10 | 3
>> 20 | 4
>> (4 rows)
>>
>> I think the cause of the error is:
>>
>> (1) WinGetFuncArgInPartition keep on fetching column x until it's
>> evalued to not null and placed in the second row (in this case that's
>> x==20). In WinGetFuncArgInPartition WinSetMarkPosition is called at
>> abs_pos==3.
>>
>> (2) WinGetFuncArgInPartition tries to fetch column y at row 0. Since
>> the mark was set to at row 3, the error occurred.
>>
>> To avoid the error, we could call WinGetFuncArgInPartition with
>> set_mark = false (and call WinSetMarkPosition separately) but I am not
>> sure if it's an acceptable solution.
>
> Attached is a v2 patch to fix the "cannot fetch row before
> WindowObject's mark position" error, by tweaking the logic to
> calculate the set mark position in WinGetFuncArgInPartition.

Attached is a v3 patch which is ready for commit IMO. Major
difference from v2 patch is, now the patch satisfies the request
below.

>>> of course only created if the window function actually asks for
>>> evaluations of a particular argno.

The NOT NULL information array is allocated only when the window
function actually asks for evaluations of a particular argno using
WinGetFuncArgInFrame or WinGetFuncArgInPartition.

If there's no objection, I am going to commit in a few days.
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Attachment Content-Type Size
v3-0001-Fix-multi-WinGetFuncArgInFrame-Partition-calls-wi.patch application/octet-stream 10.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Álvaro Herrera 2025-10-19 10:43:31 Re: FSM doesn't recover after zeroing damaged page.
Previous Message David Rowley 2025-10-19 08:52:23 Re: BRIN: Prevent the heapblk overflow during index summarization on very large tables resulting in an infinite loop