Re: Row pattern recognition

From: Henson Choi <assam258(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>, Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: zsolt(dot)parragi(at)percona(dot)com, sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, li(dot)evan(dot)chao(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Row pattern recognition
Date: 2026-05-30 14:08:38
Message-ID: CAAAe_zAuHwqUfqJOD4PDUkWsxTfTytNaandq11Kddw2bfCcpvQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi all,

Resolved since the last post:

D1. Single-row frame conformance, Subclause 6.10.2 -- Tatsuo's call [6]
was to reject. Both ROWS BETWEEN CURRENT ROW AND CURRENT ROW and
AND 0 FOLLOWING are now rejected (nocfbot-0025), which in turn
unblocks the held ExecRPRProcessRow change (nocfbot-0026).

Open decisions (repeated with context at the end):

D2. The RPRContext consolidation -- Tatsuo's call as co-author;
non-blocking either way [4].
D3. The AST "absorption" rename -- Tatsuo's call [2].

[1] Jian's review, round 1 (2026-05-26):

https://postgr.es/m/CACJufxH-DZePhbdJM=8nNYceQiSbbXXLTw54iLhxiynQ+4hbBA@mail.gmail.com
[2] my round-1 reply -- AST "absorption" rename deferred to Tatsuo (D3,
2026-05-27):

https://postgr.es/m/CAAAe_zDephfiDA_A3FN0hCymJRogEr=Rt3QoCTf4qMYDLk+xNA@mail.gmail.com
[3] Jian's review, round 2 (2026-05-28):

https://postgr.es/m/CACJufxGX17thWuEOq1tM5xbRHz2HXm1asooZC3GV25MYGmYqLQ@mail.gmail.com
[4] my round-2 reply -- RPRContext deferred to Tatsuo (D2, 2026-05-29):

https://postgr.es/m/CAAAe_zAH0MvP0TBmW3PTLeHjEpiyBz0473zJRM9pwLpseefMNw@mail.gmail.com
[5] single-row frame conformance question (D1, 2026-05-29):

https://postgr.es/m/CAAAe_zCbSU=dd-4qTL2QaBQwQ-cf51N_851a9Y5rOoz0wj0aXw@mail.gmail.com
[6] Tatsuo's reply -- reject single-row frames (D1 resolved, 2026-05-30):

https://postgr.es/m/20260530.114737.1416684464524168377.ishii@postgresql.org
[7] Jian's review, round 3 (2026-05-30):

https://postgr.es/m/CACJufxEsaU8GQ4yeXTWhAO8VjbrZTh5CpvUqz=4a3T0Cwz44pA@mail.gmail.com

Attached: the v47 feature series (v47-0001..0009) rebased onto current
master, plus the incremental patch series carried on top of it.

Base:

9a41b34a287 2026-05-26 doc: add comma to UPDATE docs, for consistency

Unchanged -- rebase only (already posted; only rebased, no content change).
Titles for reference:

nocfbot-0001 Add DEFINE non-volatile baseline to rpr_integration B9
nocfbot-0002 Unify RPR DEFINE walkers and reject volatile callees
nocfbot-0003 Cover RPR empty-match path with EXPLAIN tests; fix stale
XXX comments
nocfbot-0004 Reclassify DEFINE qualifier check and reword diagnostic to
"expression"
nocfbot-0005 Sync stale comments on DEFINE/PATTERN handling
nocfbot-0006 Add trailing commas to RPR enum definitions
nocfbot-0007 Remove optional outer parentheses from ereport() calls in
RPR files
nocfbot-0008 Add high-water mark tracking to NFA visited bitmap reset
nocfbot-0009 Document DEFINE subquery rejection as intentional
over-rejection
nocfbot-0010 Remove duplicate #include in nodeWindowAgg.c
nocfbot-0011 Normalize SQL/RPR standard references
nocfbot-0012 Add rpr_integration B7 cases for RPR in recursive query
nocfbot-0013 Reject row pattern recognition in recursive queries
nocfbot-0014 Enhance README.rpr per Tatsuo Ishii's review
nocfbot-0015 Round out README.rpr WindowAggState field coverage
nocfbot-0016 Add raw_expression_tree_walker coverage for RPR raw nodes

(nocfbot-0016 was sent earlier as 0015; renumbered here so the review
series runs contiguously from 0017.)

New incremental patches -- nocfbot-0017 onward. These apply Jian He's
review (rounds 1 [1] and 2 [3]) and settle D1. nocfbot-0017..0024 and
0026 are comment / doc / test plus a signature change and a couple of
Assert additions -- no behavior change. nocfbot-0025 is the one
user-visible change: it rejects the single-row frame per D1 [6].

nocfbot-0017 Enhance README.rpr
Chapter VIII absorption intro + a worked PATTERN (A+) trace;
"Depth-First Search" spelled out at first use; the stale
nfa_advance(initialAdvance=...) reference replaced.

nocfbot-0018 Clarify execRPR.c comments and tighten an Assert
Document the NFA invariants (compareDepth slot arithmetic, the
asymmetric visited-marking scheme, the greedy/non-nullable BEGIN
label, the ALT depth break, the state->next reset boundary),
reframe ExecRPRFinalizeAllContexts as the partition-end policy
holder, and add a defensive Assert in nfa_advance_var.

nocfbot-0019 nfa_add_state_unique: bool return -> void
The return value was unused.

nocfbot-0020 Reluctant bounded mid-band test (rpr_nfa)
A{3,5}? B, which drives the VAR-level count in nfa_advance_var
through 3..5.

nocfbot-0021 Define RPR absorption terminology in README.rpr
Terminology block for the "match_start dep." column (none /
direct / boundary check), the "boundary chk" typo fix, the
count-dominance definition, and the match_start_dependent rename.

nocfbot-0022 Document the get_reduced_frame_status cascade invariant
The branches form an order-dependent early-return cascade; the
running invariant is spelled out so the order reads as intentional.

nocfbot-0023 Explain the completed-head-context branch in
update_reduced_frame
Why a head context can already be complete under SKIP TO NEXT ROW.

nocfbot-0024 Tighten the frame-boundary check from >= to ==
The > case is unreachable by the loop invariant; the defense moves
into an Assert(currentPos <= ctxFrameEnd), and the comment changes
from "exceeded" to "reached".

nocfbot-0025 Reject single-row window frame in row pattern recognition
Per D1 [6], the frame end must be UNBOUNDED FOLLOWING or a positive
offset FOLLOWING. CURRENT ROW is rejected in transformRPR() at parse
time; a zero offset -- which need not be a constant -- in
calculate_frame_offsets() at run time. The two single-row rpr_base
tests become error cases, with a bind-parameter error test added.

nocfbot-0026 Remove the redundant zero check on the RPR frame ending
offset
With the single-row frame now rejected, a limited frame always
carries a real offset, so the "endOffsetValue != 0" guard -- which
compared a Datum directly to zero -- is dropped, leaving a plain
DatumGetInt64(). This is the offset half of Jian's ExecRPRProcessRow
cleanup; the structural half (dropping the hasLimitedFrame
parameter) I've left as-is -- it is loop-invariant and computed once
outside the per-row loop, so moving it into ExecRPRProcessRow would
just repeat the work each row.

Coverage -- my decision summaries [2], [4], with the patch each became:

Round 1 [2]
Short-circuit optimization Separate series -> separate series
Absorption README narrative Accept -> nocfbot-0017
AST-level "absorption" rename Pending Tatsuo -> D3
DFS expansion Accept -> nocfbot-0017
initialAdvance README mismatch Accept -> nocfbot-0017
Defensive Assert in advance_var Accept -> nocfbot-0018
Finalize unnecessary? Keep -> nocfbot-0018
Greedy comment label Accept -> nocfbot-0018
state->next reset Decline -> nocfbot-0018
count >= 3 test coverage Accept -> nocfbot-0020
visited marking purpose Accept -> nocfbot-0018
compareDepth comment Accept -> nocfbot-0018
Unused bool return Accept -> nocfbot-0019
ALT depth invariant Assert Decline -> nocfbot-0018

Round 2 [4]
RPRContext consolidation Tatsuo's call -> D2
get_reduced_frame_status order Keep -> nocfbot-0022
README terminology + typo Accept -> nocfbot-0021
ExecRPRProcessRow refactor Datum fix only -> nocfbot-0026
single-row frame (6.10.2, D1) Reject (Tatsuo) -> nocfbot-0025
currentPos >= -> == Accept -> nocfbot-0024
states == NULL branch coverage Already reached -> nocfbot-0023

Remaining work and decisions

Decisions (need your input):
D2 RPRContext consolidation -- Tatsuo.
D3 AST "absorption" rename -- Tatsuo.

Held pending a decision:
- the RPRContext consolidation itself -- done if D2 says go.
- the AST "absorption" rename itself -- done if D3 says go.

(D1 settled: the parse-time frame-end check, the offset-0 test as an
error case, and the Datum-comparison fix are now in nocfbot-0025/0026.)

From Jian's round-3 review [7] (into the next revision):
- four comment fixes: the line-number test anchors, the "at the END"
and uppercase-END wording, and the nfa_advance_var count premise.
- the make/clone rename of the NFA helpers (one of three attached
refactors); the other two -- dropping nfaStateSize and the elem
parameter -- I'd keep, both on hot-path grounds.

From our in-house review (separate follow-ups):
- a quantifier-normalization correctness fix (nested unbounded
quantifiers such as (A{2,})* are currently mis-normalized)
- a per-tuple memory-context fix in DEFINE evaluation (still verifying)
- smaller correctness/conformance fixes (an overflow guard, a few
missing parse-time checks, EXPLAIN output details)
- documentation gaps (comments, README, SGML)
- added regression tests (round-trip deparse, edge-case offsets)

Before the v48 fold (from Jian's off-list comments):
- INT_MAX -> PG_INT32_MAX (the unbounded-quantifier sentinel; ~24 sites)
- foreach + lfirst() -> foreach_node (~33 sites)
- foreach_current_index, dropping the redundant break (3 sites)

Work, no decision needed:
- short-circuit (lazy eval) -- stop evaluating a DEFINE predicate
once its outcome is fixed. A separate series.
It turns on a standard-interpretation point -- whether skipping is
sound when a dropped subexpression has side effects.

Thanks again to Jian for the careful reading.

Henson

Attachment Content-Type Size
nocfbot-0001-Add-DEFINE-non-volatile-baseline-to-rpr_integrati.txt text/plain 3.2 KB
nocfbot-0004-Reclassify-DEFINE-qualifier-check-and-reword-diag.txt text/plain 11.9 KB
nocfbot-0003-Cover-RPR-empty-match-path-with-EXPLAIN-tests-fix.txt text/plain 19.0 KB
nocfbot-0006-Add-trailing-commas-to-RPR-enum-definitions.txt text/plain 1.8 KB
nocfbot-0007-Remove-optional-outer-parentheses-from-ereport-ca.txt text/plain 18.8 KB
nocfbot-0009-Document-DEFINE-subquery-rejection-as-intentional.txt text/plain 2.6 KB
nocfbot-0005-Sync-stale-comments-on-DEFINE-PATTERN-handling.txt text/plain 4.4 KB
nocfbot-0002-Unify-RPR-DEFINE-walkers-and-reject-volatile-call.txt text/plain 69.1 KB
nocfbot-0010-Remove-duplicate-include-in-nodeWindowAgg.c.txt text/plain 1.1 KB
nocfbot-0008-Add-high-water-mark-tracking-to-NFA-visited-bitma.txt text/plain 5.0 KB
nocfbot-0011-Normalize-SQL-RPR-standard-references.txt text/plain 13.6 KB
nocfbot-0012-Add-rpr_integration-B7-cases-for-RPR-in-recursive.txt text/plain 8.2 KB
nocfbot-0014-Enhance-README.rpr-per-Tatsuo-Ishii-s-review.txt text/plain 5.7 KB
nocfbot-0013-Reject-row-pattern-recognition-in-recursive-queri.txt text/plain 7.0 KB
nocfbot-0015-Round-out-README.rpr-WindowAggState-field-coverag.txt text/plain 2.5 KB
nocfbot-0016-Add-raw_expression_tree_walker-coverage-for-RPR-r.txt text/plain 1.6 KB
nocfbot-0017-Enhance-README.rpr-per-Jian-He-s-review.txt text/plain 3.5 KB
nocfbot-0019-Change-nfa_add_state_unique-signature-from-bool-t.txt text/plain 2.4 KB
nocfbot-0018-Clarify-execRPR.c-comments-and-tighten-an-Assert-.txt text/plain 7.0 KB
nocfbot-0020-Add-reluctant-bounded-mid-band-test-to-rpr_nfa.txt text/plain 3.9 KB
nocfbot-0021-Define-RPR-absorption-terminology-in-README.rpr-p.txt text/plain 5.3 KB
nocfbot-0024-Tighten-the-RPR-frame-boundary-check-from-to-per-.txt text/plain 2.1 KB
nocfbot-0022-Document-the-get_reduced_frame_status-cascade-inv.txt text/plain 3.4 KB
nocfbot-0023-Explain-the-completed-head-context-branch-in-upda.txt text/plain 1.4 KB
nocfbot-0025-Reject-single-row-window-frame-in-row-pattern-rec.txt text/plain 8.1 KB
nocfbot-0026-Remove-the-redundant-zero-check-on-the-RPR-frame-.txt text/plain 1.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2026-05-30 17:12:39 Re: hashjoins vs. Bloom filters (yet again)
Previous Message Henson Choi 2026-05-30 13:45:39 Re: Row pattern recognition