Re: Row pattern recognition

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: assam258(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com
Cc: david(dot)g(dot)johnston(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, peter(at)eisentraut(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Row pattern recognition
Date: 2026-02-11 03:50:32
Message-ID: 20260211.125032.1207789314761104094.ishii@postgresql.org
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Henson,

In my understanding RANGE/GROUPS are not allowed when RPR is defined.
(See ISO/IEC 19075-5 section 6.10.2 "ROWS BETWEEN CURRENT ROW AND").
So proper fix would be to error out at the parse/analyze phase if
RANGE/GROUPS are used when RPR is defined.

Vik, Jaconb,
Thoughts?

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

> Hi Ishii-san,
>
> During code review of nodeWindowAgg.c, I discovered that
> update_reduced_frame()
> can receive the same pos value repeatedly for RANGE/GROUPS frames.
>
>
> ----------------------------------------------------------------------
> PATTERN DISCOVERED
> ----------------------------------------------------------------------
>
> For RANGE/GROUPS frames with peer rows (rows having the same ORDER BY
> value):
>
> Data: id=1,2,3 all have val=10 (peer rows)
> id=4,5 both have val=20 (peer rows)
>
> RANGE frame processing:
> - id=1 processing: pos=0 (frameheadpos points to first peer)
> - id=2 processing: pos=0 (same frameheadpos)
> - id=3 processing: pos=0 (same frameheadpos)
> - id=4 processing: pos=3
> - id=5 processing: pos=3
>
> ROWS frames always have sequential pos values (0,1,2,3,4...), but
> RANGE/GROUPS
> can have repeated pos values (0,0,0,3,3...).
>
>
> ----------------------------------------------------------------------
> CURRENT IMPLEMENTATION ISSUE
> ----------------------------------------------------------------------
>
> The current implementation appears to assume sequential processing only,
> which
> is valid for ROWS frames. Two specific issues found:
>
> 1. In update_reduced_frame():
> - The pos <= nfaLastProcessedRow check assumes "already processed pos
> doesn't
> need reprocessing", but RANGE/GROUPS require processing the same pos
> multiple times.
>
> 2. In nfa_process_row():
> - Frame boundary calculation: ctxFrameEnd = matchStartRow + frameOffset
> + 1
> - This adds frameOffset directly to row position, which works for ROWS
> but
> may be incorrect for RANGE (value-based) and GROUPS (group-based)
> frames.
>
>
> ----------------------------------------------------------------------
> TEST COVERAGE LIMITATION
> ----------------------------------------------------------------------
>
> The existing RANGE/GROUPS test cases in rpr_base.sql only cover simple
> scenarios
> that don't reach the frame end, which is why this issue wasn't detected.
>
> Most tests target ROWS frames, with only a small number testing RANGE/GROUPS
> frames.
>
>
> ----------------------------------------------------------------------
> PROPOSAL
> ----------------------------------------------------------------------
>
> Rather than directly modifying these functions, we should consider an
> alternative approach for non-UNBOUNDED frames:
> - Strictly respect the frame head position
> - Minimize lower-level optimizations (absorption/pruning)
> - Avoid creating subsequent contexts for limited reuse benefit
> - For bounded frames, the benefit of context reuse across rows may be
> limited
> compared to UNBOUNDED frames where contexts can be reused extensively
> - This would simplify RANGE/GROUPS handling at the cost of some optimization
>
> To implement this properly, we need to broaden our understanding of the
> execution structure through code review of Window clauses and Aggregate
> functions.
>
> Therefore, for this round of nodeWindowAgg.c review:
> 1. Perform code simplification/refactoring for ROWS frames only
> 2. Add FIXME comments to RANGE/GROUPS test cases
> 3. Defer RANGE/GROUPS handling to a future phase
>
> There may be other issues beyond RANGE/GROUPS.
> After this code review round is complete, I would like to regroup and
> discuss
> the direction forward.
>
> Best regards,
> Henson

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2026-02-11 03:51:29 Re: [PATCH] Support automatic sequence replication
Previous Message Andreas Karlsson 2026-02-11 03:50:21 Re: PL/Julia: clarification on IN array parameters issue