Quick Links

Re: Row pattern recognition

From:	Tatsuo Ishii <ishii(at)postgresql(dot)org>
To:	assam258(at)gmail(dot)com
Cc:	vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Row pattern recognition
Date:	2026-02-15 09:06:52
Message-ID:	20260215.180652.676783304925959084.ishii@postgresql.org
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi Henson,

Thanks for the patches! I applied them on top of v42 and rebased
against current master. Attached are the v43 patches.

> Hi Tatsuo,
>
> This round focused on reviewing nodeWindowAgg.c -- the NFA executor
> -- which was the last piece remaining on my review list. I've also
> added a comprehensive NFA runtime test suite.
>
> The attached patch continues the incremental series on top of v42.
> Here are the changes:
>
>
> FIXME issues documented:
>
> These require non-trivial structural changes, so I've documented
> them as FIXME comments for now rather than attempting a fix.
>
> - altPriority tracks only the last ALT choice, so repeated or nested
> ALTs like (A|B)+ cannot correctly implement SQL standard lexical
> ordering. A full-path classifier structure is needed.
>
> - Cycle prevention condition (count == 0 && min == 0) is insufficient
> for patterns like (A*)* where cycles occur at count > 0. Currently
> relies on implicit duplicate detection in nfa_add_state_unique.
>
>
> Bugs fixed:
>
> - Fix nfa_advance_begin() routing order: enter group first (lexically
> first), then skip path (lexically second).
>
> - Add ALT scope boundary check in nfa_advance_alt() and
> computeAbsorbabilityRecursive() to stop branch traversal when
> element depth exits ALT scope.
>
> - Disallow RANGE and GROUPS frame types with row pattern
> recognition, as the SQL standard only requires ROWS.
> Also remove the now-dead frame-type determination code.
>
>
> NFA executor refactoring (nodeWindowAgg.c):
>
> - Simplify nfa_process_row() phase 1: remove nfa_advance() call
> after forced mismatch at frame boundary, since all states are
> already at VAR positions and mismatch removes them. Replace
> the redundant phase 3 frame boundary check with Assert.
>
> - Simplify update_reduced_frame() result registration: mismatch
> path now returns early, and the post-processing SKIP mode cleanup
> block (nfa_remove_contexts_up_to vs nfa_context_free) is removed
> since eager pruning handles SKIP PAST LAST ROW and both paths
> now simply call nfa_context_free().
>
> - Replace nfa_remove_contexts_up_to() with eager pruning inside
> nfa_add_matched_state(). When matchEndRow is extended during
> greedy matching, pending contexts within the match range are
> pruned incrementally instead of after match completion. For
> SKIP PAST LAST ROW patterns like START UP+, this reduces the
> number of live contexts at each row from O(n) to O(1), avoiding
> O(n^2) per-row processing of contexts that would be skipped
> anyway.
>
> - Optimize nfa_states_equal() to compare counts only up to the
> current element's depth instead of the full maxDepth.
>
> - Rename nfa_state_clone -> nfa_state_create,
> nfa_find_context_for_pos -> nfa_get_head_context for clarity.
>
> - Add nfa_record_context_success/skipped/absorbed statistics helpers.
>
> - Remove unused RPRPattern parameter from
> nfa_update_absorption_flags().
>
>
> Absorbability refactoring (rpr.c):
>
> - Remove parentDepth parameter from isUnboundedStart() and
> computeAbsorbabilityRecursive(). Scope boundaries are now
> determined by element depth comparison instead of an explicit
> parent depth parameter.
>
> - Simplify isUnboundedStart(): check simple VAR case first, then
> GROUP case with a depth-bounded loop instead of a FIN-terminated
> traversal with multiple break conditions.
>
>
> Test changes:
>
> - Add rpr_nfa.sql: comprehensive NFA runtime test suite covering
> quantifier boundaries (min/max/exact), alternation priority, nested
> patterns, frame boundary variations, INITIAL mode (syntax error
> expected), pathological patterns, context absorption, and FIXME
> reproduction cases.
>
> - Add nth_value out-of-frame tests, ReScan/LATERAL test, and
> nth_value IGNORE NULLS test to rpr.sql.
>
> - Change selected EXPLAIN test queries in rpr_explain.sql from
> EXPLAIN (COSTS OFF) to EXPLAIN (ANALYZE, ...) to verify actual
> NFA execution statistics.
>
> - Fix stale comments across rpr.sql, rpr_base.sql, and rpr_nfa.sql:
> remove resolved BUG annotations, update error messages to match
> actual output, correct optimization result descriptions, and
> standardize Expected comment placement to after SQL statements.
>
>
> Other changes:
>
> - Run pgindent on RPR source files. Add /*----------*/ comment
> guards to protect structured comments from reformatting.
>
> Coverage:
>
> I ran gcov on the modified lines (diff-only coverage). The attached
> coverage.zip contains an HTML report. Summary:
>
> - 18 files, 124 functions, 2238 modified lines analyzed
> - Overall: 92.1% line coverage (2061/2238)
> - Core files (nodeWindowAgg.c, rpr.c, explain.c, parse_rpr.c):
> 98.0-98.5% each
> - Node serialization (outfuncs.c, readfuncs.c, equalfuncs.c):
> 0% -- these implement RPRPattern serialization for plan caching,
> but regression tests don't exercise prepared statements with RPR
> - ruleutils.c: 96.3% -- untested lines are the reluctant quantifier
> display path ({1}?), which is currently rejected at parse time

Thanks for the report.

> The node serialization functions (141 lines, 0% coverage) are the
> largest untested area. I'm not sure how to trigger these paths
> in the regression test framework. Any suggestions?

I think we can leave it as it is, until reluctant quantifier is
implemented.

> I'll send a separate email within a few days listing the FIXME
> issues and other unresolved items from the mailing list discussion
> for your review.

Looking forward to reading your email.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Attachment	Content-Type	Size
v43-0001-Row-pattern-recognition-patch-for-raw-parser.patch	application/octet-stream	32.2 KB
v43-0002-Row-pattern-recognition-patch-parse-analysis.patch	application/octet-stream	27.4 KB
v43-0003-Row-pattern-recognition-patch-rewriter.patch	application/octet-stream	5.8 KB
v43-0004-Row-pattern-recognition-patch-planner.patch	application/octet-stream	65.5 KB
v43-0005-Row-pattern-recognition-patch-executor-and-comma.patch	application/octet-stream	102.5 KB
v43-0006-Row-pattern-recognition-patch-docs.patch	application/octet-stream	16.0 KB
v43-0007-Row-pattern-recognition-patch-tests.patch	application/octet-stream	868.1 KB
v43-0008-Row-pattern-recognition-patch-typedefs.list.patch	application/octet-stream	1.1 KB

In response to

Re: Row pattern recognition at 2026-02-14 14:58:10 from Henson Choi

Responses

Re: Row pattern recognition at 2026-02-16 15:09:04 from Henson Choi

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Rahila Syed	2026-02-15 12:40:06	Re: Enhancing Memory Context Statistics Reporting
Previous Message	Reshmithaa	2026-02-15 08:53:10	Crash related to Shared Memory