Quick Links

Re: Row pattern recognition

From:	Henson Choi <assam258(at)gmail(dot)com>
To:	Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc:	vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Row pattern recognition
Date:	2026-02-26 13:46:01
Message-ID:	CAAAe_zDCwjuzomxnsR5uO-d7D-1RD53M7Nb1USTm738YyNRwpw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi hackers,

I'd like to share an internal design document for the NFA executor
in the row pattern recognition (RPR) patch.

The attached guide covers:

- Pattern element structure (RPRPatternElement fields and flags)
- Compilation phases (AST to flat element array)
- Runtime execution model (Match / Absorb / Advance phases)
- Cycle detection for zero-consumption loops
- Empty-loop handling for nullable group bodies
- Reluctant quantifier support
- Absorption optimization for redundant context pruning

The full English text follows below. All three language
versions are also attached:

RPR_NFA_Algorithm_Guide.txt (English)
RPR_NFA_Algorithm_Guide_KO.txt (Korean)
RPR_NFA_Algorithm_Guide_JA.txt (Japanese)

These are reference documents intended to help reviewers understand
the NFA executor internals. They are not part of the patch itself.

================================================================================
PostgreSQL Row Pattern Recognition: Flat-Array Stream NFA Guide
================================================================================

Target audience: Developers with a basic understanding of the PostgreSQL
executor and planner architecture

Scope: The entire process from PATTERN/DEFINE clause parsing to NFA
runtime execution

Related code:
- src/backend/parser/parse_rpr.c (parser phase)
- src/backend/optimizer/plan/rpr.c (optimizer phase)
- src/backend/executor/nodeWindowAgg.c (executor phase)
- src/include/nodes/plannodes.h (plan node definitions)
- src/include/nodes/execnodes.h (execution state definitions)
- src/include/optimizer/rpr.h (types and constants)

================================================================================

What is a Flat-Array Stream NFA?

The NFA in this implementation is not a traditional state-transition graph
but a flat array of fixed-size 16-byte elements. At runtime, it processes
the row stream in a forward-only manner, expanding epsilon transitions
eagerly without backtracking.

- Flat-Array: Pattern compiled into a flat array,
not a graph (Chapter IV)
- Stream: Rows consumed sequentially in one direction,
never revisited (Chapter XII)
- NFA: Nondeterministic execution where multiple states
coexist within a single context (Chapter VI)

Chapter I Row Pattern Recognition Overview
================================================================================

Row Pattern Recognition (hereafter RPR) is a feature introduced in SQL:2016
that matches regex-based patterns against ordered row sets.

The SQL standard defines two forms:

Feature R010: MATCH_RECOGNIZE (FROM clause)
- Dedicated table operator
- Provides dedicated functions such as MATCH_NUMBER(), CLASSIFIER()
- Supports ONE ROW PER MATCH / ALL ROWS PER MATCH

Feature R020: RPR in a window (WINDOW clause)
- Integrated into the existing window function framework
- Supports ALL ROWS PER MATCH only
- No MATCH_NUMBER()

This implementation targets Feature R020.

The basic syntax is as follows:

SELECT ...
OVER (
PARTITION BY ...
ORDER BY ...
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
[INITIAL | SEEK] -- SEEK is defined in the standard but not
implemented
AFTER MATCH SKIP TO NEXT ROW | SKIP PAST LAST ROW
PATTERN ( <regex> )
DEFINE <variable> AS <condition>, ...
)

The PATTERN clause is a regular expression over row pattern variables.
The DEFINE clause specifies boolean conditions that determine whether each
variable evaluates to true for the current row.

Example:

PATTERN (A+ B)
DEFINE A AS price > PREV(price),
B AS price < PREV(price)

This pattern matches "a span where prices rise consecutively then drop."

Chapter II Overall Processing Pipeline
================================================================================

RPR processing is divided into three phases:

+------------------------------------------------------------+
| 1. Parsing (Parser) |
| SQL text -> PATTERN AST + DEFINE expression tree |
| |
| 2. Compilation (Optimizer/Planner) |
| PATTERN AST -> optimization -> flat NFA element array |
| |
| 3. Execution (Executor) |
| Row-by-row matching via NFA simulation |
+------------------------------------------------------------+

Each phase uses independent data structures, and the interfaces between
phases are well-defined:

Parser -> Planner: WindowClause.rpPattern (RPRPatternNode tree)
WindowClause.defineClause (TargetEntry list)

Planner -> Executor: WindowAgg.rpPattern (RPRPattern struct)
WindowAgg.defineClause (TargetEntry list)

Chapter III Parsing Phase
================================================================================

III-1. Entry Point

transformWindowDefinitions() (parse_clause.c)
+-- transformRPR() (parse_rpr.c)

transformRPR() is invoked when RPCommonSyntax is present and performs the
following:

(1) Frame option validation
- Only ROWS is allowed (RANGE, GROUPS are not)
- The start boundary must be CURRENT ROW
- EXCLUDE option is not allowed

(2) Transcription to WindowClause
- Copies rpPattern, rpSkipTo, initial fields

(3) DEFINE clause transformation (transformDefineClause)

III-2. PATTERN AST

The parser transforms the PATTERN clause into an RPRPatternNode tree.
Each node has one of the following four types:

RPR_PATTERN_VAR Variable reference. Name stored in varName field.
RPR_PATTERN_SEQ Concatenation. Children node list in children.
RPR_PATTERN_ALT Alternation. Branch node list in children.
RPR_PATTERN_GROUP Group (parentheses). Body node list in children.

All nodes have min/max fields to express quantifiers:

A -> VAR(A, min=1, max=1)
A+ -> VAR(A, min=1, max=INF)
A* -> VAR(A, min=0, max=INF)
A? -> VAR(A, min=0, max=1)
A{3,5} -> VAR(A, min=3, max=5)

If the reluctant field is not -1, the quantifier is reluctant (non-greedy).

Example: PATTERN ((A+ B) | C*)

ALT
+-- SEQ
| +-- VAR(A, 1, INF)
| +-- VAR(B, 1, 1)
+-- VAR(C, 0, INF)

III-3. DEFINE Clause Transformation

transformDefineClause() processes each DEFINE variable as follows:

(1) Checks for duplicate variable names
(2) Transforms the expression into a standard SQL expression
(3) Coerces to Boolean type (coerce_to_boolean)
(4) Wraps in a TargetEntry with the variable name set in resname

Variables that are used in PATTERN but not defined in DEFINE are implicitly
evaluated as TRUE (matching all rows).

Chapter IV Compilation Phase
================================================================================

IV-1. Entry Point

create_windowagg_plan() (createplan.c)
+-- collectPatternVariables() Collect variable names
+-- filterDefineClause() Remove unused DEFINE entries
+-- buildRPRPattern() NFA compilation (6 phases)

IV-2. The 6 Phases of buildRPRPattern()

Phase 1: AST optimization (optimizeRPRPattern)
Phase 2: Statistics collection (scanRPRPattern)
Phase 3: Memory allocation (allocateRPRPattern)
Phase 4: NFA element fill (fillRPRPattern)
Phase 5: Finalization (finalizeRPRPattern)
Phase 6: Absorbability analysis (computeAbsorbability)

IV-3. Phase 1: AST Optimization

After copying the parser-generated AST, the following optimizations are
applied:

(a) SEQ flattening: Unwrap nested SEQ nodes
SEQ(A, SEQ(B, C)) -> SEQ(A, B, C)

(b) Consecutive variable merging: Merge consecutive occurrences of the
same variable into a single quantifier
A A -> A{2}
A{2,3} A{1,2} -> A{3,5}

(d) Consecutive ALT merging: Merge repeated identical ALT nodes
(A | B) (A | B) (A | B) -> (A | B){3}

(e) Prefix/suffix absorption: Absorb identical sequences before/after
a group
A B (A B)+ -> (A B){2,INF}

(f) ALT flattening and deduplication
(A | (B | C)) -> (A | B | C)
(A | B | A) -> (A | B)

(g) Quantifier multiplication: Collapse nested quantifiers when safe
(A+)+ -> A+
(A{2,3}){5} -> A{10,15}

(h) Single-child unwrap
SEQ(A) -> A, (A){1,1} -> A

IV-4. Phase 4: NFA Element Array Generation

Transforms the optimized AST into a flat array of RPRPatternElement.
This is the core data structure used for NFA simulation at runtime.

RPRPatternElement struct (16 bytes):

Field Size Description
---------------------------------------------------------
varId 1B Variable ID (0-251) or control code (252-255)
depth 1B Group nesting depth
flags 1B Bit flags (see below)
reserved 1B Padding
min 4B Quantifier lower bound
max 4B Quantifier upper bound
next 2B Next element index (sequential flow)
jump 2B Branch target index (for ALT/GROUP)

Control codes:

RPR_VARID_BEGIN (252) Group start marker
RPR_VARID_END (253) Group end marker
RPR_VARID_ALT (254) Alternation start marker
RPR_VARID_FIN (255) Pattern completion marker

Element flags (1 byte, bitmask):

Bit Constant Set on Description
--------------------------------------------------------------------------
0x01 RPR_ELEM_RELUCTANT VAR, BEGIN, Non-greedy quantifier.
END Prefers shorter match: try
exit-loop first, then
repeat.
Set on VAR for simple
(A+?),
on BEGIN+END for group
((...)+?)

0x02 RPR_ELEM_EMPTY_LOOP END Group body can produce
empty
match (all children
nullable).
Creates a fast-forward
exit clone
alongside the normal
loop-back so
cycle detection doesn't
kill
legitimate matches. (IV-4b)

0x04 RPR_ELEM_ABSORBABLE_BRANCH VAR, BEGIN, Element lies within an
END, ALT absorbable region. Used at
runtime to track whether
the
current NFA state is in an
absorbable context.

0x08 RPR_ELEM_ABSORBABLE VAR, END Absorption judgment point.
WHERE to compare
consecutive
iterations for absorption.
- Simple unbounded VAR
(A+):
set on the VAR itself
- Unbounded GROUP ((A B)+):
set on the END element
only

Accessor macros:
RPRElemIsReluctant(e) (e)->flags & 0x01
RPRElemCanEmptyLoop(e) (e)->flags & 0x02
RPRElemIsAbsorbableBranch(e) (e)->flags & 0x04
RPRElemIsAbsorbable(e) (e)->flags & 0x08

Example: PATTERN (A+ B | C)

AST: ALT(SEQ(VAR(A,1,INF), VAR(B,1,1)), VAR(C,1,1))

Compilation result:

idx varId depth min max next jump Description
---------------------------------------------------
0 ALT 0 1 1 1 -1 Alternation start
1 A(0) 1 1 INF 2 3 Branch 1: A+
2 B(1) 1 1 1 4 -1 Branch 1: B -> FIN
3 C(2) 1 1 1 4 -1 Branch 2: C -> FIN
4 FIN 0 1 1 -1 -1 Pattern completion

- idx 0: ALT marker. next(=1) is the start of the first branch
- idx 1: Variable A. next(=2) is B, jump(=3) is the start of the second
branch
- idx 2: Variable B. next(=4) is FIN
- idx 3: Variable C. next(=4) is FIN
- idx 4: FIN marker. Match completion signal

Roles of next and jump:

- next: The next element to move to "after consuming" the current element.
For VAR, the next position after a successful match.
For BEGIN/END, the next position inside/outside the group.

- jump: The element to "skip to."
In ALT, a jump from one branch to the next branch.
In BEGIN, a skip path to END+1 (for groups with min=0).
In END, a loop-back to the start of the group body.

Example: PATTERN ((A B)+)

idx varId depth min max next jump Description
---------------------------------------------------
0 BEGIN 0 1 INF 1 4 Group start
1 A(0) 1 1 1 2 -1 A
2 B(1) 1 1 1 3 -1 B
3 END 0 1 INF 4 1 Group end
4 FIN 0 1 1 -1 -1 Pattern completion

- idx 0: BEGIN. next(=1) enters the group body.
jump(=4) skips to after END = FIN (used when min=0).
- idx 3: END. next(=4) exits the group.
jump(=1) loops back to the start of the group body.

IV-4a. Reluctant Flag (RPR_ELEM_RELUCTANT)

The reluctant flag is set during Phase 4 (fillRPRPattern) when the AST node
has a non-negative reluctant field (reluctant >= 0). It reverses the
priority
of quantifier expansion at runtime:

Greedy (default): try loop-back first, then exit (prefer longer match)
Reluctant: try exit first, then loop-back (prefer shorter match)

The flag is set on all elements that carry the quantifier:

Simple VAR (A+?): RPR_ELEM_RELUCTANT on the VAR element
Group ((...)+?): RPR_ELEM_RELUCTANT on BEGIN and END elements

At runtime (nfa_advance), the flag controls DFS exploration order:

VAR with quantifier:
Greedy: primary path = next (continue), clone = jump (skip)
Reluctant: primary path = jump (skip), clone = next (continue)

END element:
Greedy: primary path = jump (loop-back), clone = next (exit)
Reluctant: primary path = next (exit), clone = jump (loop-back)

BEGIN with min=0:
Greedy: primary path = next (enter group), clone = jump (skip)
Reluctant: primary path = jump (skip), clone = next (enter group)

The absorption optimization requires greedy quantifiers. Reluctant
quantifiers are excluded from absorbability analysis (see IV-5).

IV-4b. Empty Loop Flag (RPR_ELEM_EMPTY_LOOP)

The empty-loop flag is set during Phase 4 (fillRPRPatternGroup) on the END
element when the group body is nullable -- i.e., every path through the
body can match zero rows (all children are nullable).

Example patterns that trigger this flag:

(A?)* A is nullable (min=0), so group body is nullable -> END gets flag
(A? B?)+ Both children nullable -> body nullable -> END gets flag
(A | B*) B* is nullable, making the ALT nullable -> END gets flag

The flag works in conjunction with the zero-consumption cycle detection
(elemIdx visited bitmap). Without this flag, cycle detection alone would
cause legitimate matches to fail.

Problem example: (A*){2,3}
- Iteration 1: A* consumes all available rows -> count=1, END reached
- Loop-back for iteration 2: A* matches zero rows -> END reached again
- Cycle detection sees the same elemIdx on the same row -> state killed
- count never reaches min(2) -> match fails (incorrect)

With the RPR_ELEM_EMPTY_LOOP flag, nfa_advance_end creates two paths:
the normal loop-back (which cycle detection will eventually kill) and
a fast-forward exit clone that bypasses the loop entirely.
(See IX-4(c) for detailed runtime behavior.)
- Zero-consumption is impossible since body is not nullable

IV-5. Absorbability Analysis (RPR_ELEM_ABSORBABLE)

Context absorption is an optimization technique that reduces O(n^2) to O(n).
(Runtime behavior is described in Chapter VIII.)

This phase determines whether the pattern has a structure suitable for the
absorption optimization and sets flags on the relevant elements:

RPR_ELEM_ABSORBABLE Absorption comparison point
RPR_ELEM_ABSORBABLE_BRANCH Element within an absorbable region

Eligibility conditions:

(1) SKIP PAST LAST ROW (not NEXT ROW)
(2) Frame end is UNBOUNDED FOLLOWING

Structural conditions (isUnboundedStart + computeAbsorbabilityRecursive):

Case 1: Simple VAR+ (e.g., A+)
-> ABSORBABLE | ABSORBABLE_BRANCH set on the VAR
Case 2: GROUP+ whose body consists only of {1,1} VARs (e.g., (A B)+)
-> ABSORBABLE_BRANCH on children,
ABSORBABLE | ABSORBABLE_BRANCH on END
Case 3: GROUP+ whose body starts with VAR+ (e.g., (A+ B)+)
-> Recurses from BEGIN into the body, applying Case 1.
ABSORBABLE | ABSORBABLE_BRANCH set on A.
B and END get no flags -> absorption stops once past A.

Absorbability is determined per-element, not per-pattern.
Absorption comparison is performed only when a state resides at an
element with the RPR_ELEM_ABSORBABLE flag. Once a state leaves the
flagged region, absorption is permanently disabled for that state.

Through this mechanism, the runtime guarantees monotonicity:
"a context that started earlier always subsumes a context that
started later."

Chapter V NFA Runtime Data Structures
================================================================================

V-1. RPRNFAState -- NFA State

A single NFA state represents "how far the pattern has progressed."

Field Description
---------------------------------------------------------
elemIdx Index of the current pattern element
counts[] Repetition count per group depth
isAbsorbable Whether the state is in an absorbable region
next Next state in the linked list

The size of the counts array is (maxDepth + 1), allocated as a flexible
array member at the end of the struct.

Example: In PATTERN ((A B)+ C), a state waiting for B in the 3rd iteration

Element array: [0:BEGIN(d0) 1:A(d1) 2:B(d1) 3:END(d0) 4:C(d0) 5:FIN]

elemIdx = 2 (B, depth 1)
counts[0] = 2 (depth 0: depth of END. Group completed 2 iterations)
counts[1] = 1 (depth 1: depth of B. A matched in current iteration)

Counts are indexed by depth, not by elemIdx.
counts[0] is incremented when passing through END(depth 0),
and the group repetition count is preserved even when
the state is at B(depth 1).

Definition of two states being "equal":

Two states are equal if they have the same elemIdx and the same counts
up to the depth of that element.
nfa_states_equal() compares counts[0..elem->depth] using memcmp.
Only counts at or below the depth of the current element are meaningful.

V-2. RPRNFAContext -- Matching Context

A single context represents "a matching attempt started from a specific
start row."

Field Description
---------------------------------------------------------
states Linked list of active NFA states
matchStartRow Row number where matching started
matchEndRow Row number where matching completed (-1 if
incomplete)
lastProcessedRow Last row processed
matchedState State that reached FIN (for greedy fallback)
hasAbsorbableState Whether this context can absorb other contexts
allStatesAbsorbable Whether this context can be absorbed
next, prev Doubly-linked list

Since the NFA is nondeterministic, multiple states can coexist
simultaneously within a single context.

Example: In PATTERN (A | B) C, if the first row matches both A and B,
two states coexist within the context:

State 1: elemIdx=3 (waiting for C, via branch A)
State 2: elemIdx=3 (waiting for C, via branch B)

In this case, since the (elemIdx, counts) of the two states are equal,
nfa_add_state_unique() retains only State 1 (branch A), which was
added first.
Because DFS processes the first branch of ALT first, the state via A
is registered first, and the state via B is discarded as a duplicate.
This is the preferment guarantee.

V-3. RPR Fields of WindowAggState

nfaContext / nfaContextTail Doubly-linked list of active contexts
nfaContextFree Reuse pool for contexts
nfaStateFree Reuse pool for states
nfaVarMatched Per-row cache: varMatched[varId]
nfaVisitedElems Bitmap for cycle detection
nfaStateSize Precomputed size of RPRNFAState

Memory management:

States and contexts are managed through their own free lists.
Instead of palloc, they are obtained from the reuse pool, and
returned to the pool upon deallocation.
This reduces the overhead of frequent allocation/deallocation.

Chapter VI NFA Execution: 3-Phase Model
================================================================================

VI-1. Entry Point and Overall Flow

When the window function processes each row, row_is_in_reduced_frame()
is called. This function determines whether the current row belongs to
a matched frame, and if necessary, calls update_reduced_frame() to
drive the NFA.

Flow of update_reduced_frame():

(1) Find or create a context for the target row
(2) Enter the row processing loop
(3) After the loop ends, record the result in reduced_frame_map

Pseudocode of the row processing loop:

targetCtx = nfa_get_head_context(pos)
if targetCtx == NULL:
targetCtx = nfa_start_context(pos)

for currentPos = startPos ...
if not nfa_evaluate_row(currentPos): -- row does not exist
nfa_finalize_all_contexts() -- finalize all contexts
break

nfa_process_row(currentPos) -- 3-phase processing
nfa_start_context(currentPos + 1) -- pre-create next start point
nfa_cleanup_dead_contexts() -- remove dead contexts

if targetCtx->states == NULL: -- target context exhausted
break

Key point: Processing a single row may require processing multiple rows
ahead. Due to the nature of window functions, determining the frame for
row N requires looking at rows beyond N.

VI-2. Context Creation: nfa_start_context()

Creates a new context and performs the initial advance.

(1) Allocate context via nfa_context_alloc()
(2) Set matchStartRow = pos
(3) Create initial state: elemIdx=0 (first pattern element), counts=all
zero
(4) Call nfa_advance(initialAdvance=true)

The initial advance expands epsilon transitions at the beginning of
the pattern. For example, the initial advance for PATTERN ((A | B) C):

Start: elemIdx=0 (ALT)
-> Expand ALT branches
-> elemIdx=1 (A) -- VAR, so add state; stop here
-> elemIdx=2 (B) -- VAR, so add state; stop here

Result: Two states in the context {waiting for A, waiting for B}

During the initial advance, reaching FIN is not recorded as a match.
This is to prevent empty matches (zero-length matches).

VI-3. Row Evaluation: nfa_evaluate_row()

Evaluates all variable conditions in the DEFINE clause at once for
the current row.

for each defineClause[i]:
result = ExecEvalExpr(defineClause[i])
varMatched[i] = (not null and true)

To support row navigation operators such as PREV() and NEXT(),
the previous row, current row, and next row are set in separate slots:

ecxt_scantuple = previous row (for PREV reference)
ecxt_outertuple = current row (default reference)
ecxt_innertuple = next row (for NEXT reference)

The varMatched array is referenced later in Phase 1 (Match).

VI-4. nfa_process_row(): 3-Phase Processing

NFA processing for a single row is divided into three phases:

This ordering is important:

- Match executes first to "consume the current row."
- Absorb executes immediately after Match, when states have been updated.
- Advance executes last to prepare "states waiting for the next row."

Chapter VII Phase 1: Match
================================================================================

nfa_match() iterates through each state in the context:

(1) Check whether the state's elemIdx is a VAR element
(2) Compare against the current row using nfa_eval_var_match()
(3) Match success: increment repetition count, retain state
(4) Match failure: remove state

Match determination (nfa_eval_var_match):

If varId is within the range of defineVariableList:
Use the value of varMatched[varId]

If varId exceeds the range (variable not defined in DEFINE):
Unconditionally true (matches all rows)

Immediate advance for simple VARs:

For a VAR with min=1, max=1 where the next element is END,
the Match phase processes through END immediately.
This is necessary for accurate state comparison in Phase 2 (Absorb).

Example: In PATTERN ((A B)+), when A matches, it immediately advances
to B, and when B matches, it immediately advances through END to
complete the group count. This enables absorption comparison with
other contexts.

Chapter VIII Phase 2: Absorb (Context Absorption)
================================================================================

VIII-1. Problem

In the current implementation, a new context is started for each row
processed.
Applying PATTERN (A+) to 10 rows produces 10 contexts,
each of which tracks state independently.

If there are N rows, the total number of states becomes O(N^2):

Context 1 (started at row 1): can match A up to N times
Context 2 (started at row 2): can match A up to N-1 times
...
Context N (started at row N): can match A 1 time

VIII-2. Solution: Context Absorption

Key observation: a context started earlier contains
all matches of a later-started context (monotonicity principle).

If Context 1 started at row 1 and matched A 5 times,
the state where Context 2 (started at row 2) matched A 4 times
is already contained within Context 1.

Therefore Context 2 can be "absorbed" into Context 1.

VIII-3. Absorption Conditions

(1) The pattern is marked as isAbsorbable (see IV-5)
(2) allStatesAbsorbable of the target context is true
(3) An earlier context "covers" all states of the target

Cover condition (nfa_states_covered):

A state with the same elemIdx exists in the earlier context,
and the count at that depth is greater than or equal -- then it is
covered.

VIII-4. Dual-Flag Design

Two boolean flags make the absorption decision efficient:

hasAbsorbableState (monotonic: only true->false transition possible)
"Does this context have the ability to absorb other contexts?"
true if at least one absorbable state exists.
Transitions to false when states are removed leaving no absorbable
states.
Once false, it never becomes true again.

allStatesAbsorbable (dynamic: can fluctuate)
"Can this context be absorbed?"
true if all states are in an absorbable region.
Becomes false when a non-absorbable state is added; reverts to true
when it is removed.

VIII-5. Absorption Order

nfa_absorb_contexts() traverses from tail (newest) to head (oldest).

for ctx = tail to head:
if ctx.allStatesAbsorbable:
for older = ctx.prev to head:
if older.hasAbsorbableState:
if nfa_states_covered(older, ctx):
free(ctx) -- absorbed
break

Since inspection starts from the newest context, the most recently started
(= having the shortest match) context is absorbed first.

Chapter IX Phase 3: Advance (Epsilon Transition Expansion)
================================================================================

IX-1. Overview

nfa_advance() expands epsilon transitions from each state after Match,
generating "new states waiting for the next row."

An epsilon transition is a transition that moves without consuming a row:

- ALT: branch to each alternative
- BEGIN: enter group (or skip if min=0)
- END: loop-back within group (or exit when condition is met)
- FIN: record match completion
- VAR loop/exit: repeat/exit according to the quantifier

Expansion stops upon reaching a VAR element, and the state is added.
This is because VAR is the element that "will consume the next row."

IX-2. Processing Order: DFS and Preferment

advance processes states in lexicographic order,
performing Depth-First Search (DFS) on each state.

This DFS order is what guarantees the SQL standard's "preferment":

The branch that appears first in the PATTERN text takes precedence.

Example: PATTERN (A | B) C

The first branch A of the ALT takes precedence over the second branch B.
When both A and B can match, the match via A is selected.

nfa_add_state_unique() prevents duplicate addition of the same state,
so the state added first (= from the preferred branch) is retained.

IX-3. Routing Function: nfa_route_to_elem()

All inter-element transitions in the advance phase go through
nfa_route_to_elem().
This function branches its behavior based on the type of the next element:

If the next element is VAR:
(1) Add the state to the context (nfa_add_state_unique)
(2) If the VAR has min=0, also add a skip path (recurse via next)
-> Expansion stops here (VAR is the element that "will consume the next
row")

If the next element is non-VAR (ALT, BEGIN, END, FIN):
-> Recursively call nfa_advance_state() to continue expansion

With this structure, advance recursively follows epsilon transitions
until reaching a VAR, consistently stopping only at VAR elements.

IX-4. Per-Element advance Behavior

(a) ALT (nfa_advance_alt)

Upon encountering an ALT element, all branches are expanded in order.
The first element of each branch is connected via a jump pointer.

idx=0 (ALT) -> branch 1 start (next) -> branch 2 start (jump) -> ...

nfa_advance_state() is recursively called for each branch.

(b) BEGIN (nfa_advance_begin)

Handles group entry.
jump points to the element after END (= first element outside the group).

Greedy (default):
(1) Enter the group body (move via next, reset the count at that depth)
(2) If min=0, also add a group skip path (move via jump)

Reluctant:
Order reversed -- skip path first, group entry second.
If the skip path reaches FIN, the group entry path is not generated
(shortest match preferred).

Handles group termination. This is the core of the repetition logic.

Let count be the count at the current depth:

count < min:
Loop-back (move via jump, repeat the group body)

If the RPR_ELEM_EMPTY_LOOP flag is set:
In addition to loop-back, also add a fast-forward exit path.
This is because the body may produce an empty match, causing count
to never reach min. fast-forward resets counts[depth] to 0
and exits via next (treating the remaining required iterations
as empty matches).

min <= count < max:
Greedy: loop-back first, exit second
Reluctant: exit first, loop-back second
If the exit path reaches FIN, loop-back is omitted.

count >= max:
Unconditional exit (move via next)

On exit: reset counts[depth] = 0, and if the next element is an outer END,
increment the count at the outer depth.

(d) VAR (nfa_advance_var)

Handles repeat/exit for a VAR element with a quantifier.

Let count be the count at the current depth:

count < min:
Unconditional loop (stay at the same elemIdx, wait for the next row)

min <= count < max:
Greedy: loop first, exit (next) second
Reluctant: exit first, loop second
If the exit path reaches FIN, loop is omitted.

count >= max:
Unconditional exit (move via next)

On exit: reset counts[depth] = 0.

(e) FIN

Match success. The current state is moved to matchedState for recording,
and matchEndRow is set to the current row.

Upon reaching FIN, all remaining unprocessed states are removed
(early termination). By DFS order, the path that reached FIN first
has the highest preferment, so the rest are inferior paths.
This is the core mechanism that guarantees preferment.

In SKIP PAST LAST ROW mode, upon reaching FIN, subsequent contexts
that started within the match range are immediately pruned.

IX-5. State Deduplication: nfa_add_state_unique()

When adding a new state to a context, it is compared against existing
states;
if an identical state already exists, it is not added.

Comparison criteria: elemIdx + counts[0..elem->depth] (see V-1)

This deduplication is the core mechanism that suppresses NFA state
explosion.
Because DFS order causes preferred-branch states to be added first,
identical states from lower-priority branches are automatically discarded.

IX-6. Cycle Detection: nfaVisitedElems

When a group body can produce an empty match (zero consumption),
looping back from END may cause an infinite loop.

Example: PATTERN ((A?)*)

A? has min=0, so it can pass through without matching.
If the outer group repeats: BEGIN -> A? skip -> END -> BEGIN -> ...

To prevent this:

(1) At compile time: set the RPR_ELEM_EMPTY_LOOP flag on the END
of groups whose body is nullable.
The runtime effect of this flag is described in IX-4(c):
when count < min, a fast-forward exit path is added,
resolving the deadlock where count cannot increase due to empty
matches.

(2) At runtime: initialize the nfaVisitedElems bitmap immediately before
DFS expansion of each state within advance (once per state).
During DFS, set the corresponding elemIdx bit when visiting each
element.
If a previously visited elemIdx is revisited, that path is terminated.

Note: the bitmap tracks only elemIdx and does not consider counts.
Therefore, legitimate revisits to the same elemIdx but with different
counts
may also be blocked. This only occurs when the group body is nullable
(all paths can match empty), causing END -> loop-back -> skip -> END
within a single DFS. In such cases the END element has the
RPR_ELEM_EMPTY_LOOP flag, so the fast-forward exit (IX-4(c)) provides
an alternative path that bypasses the cycle.

Chapter X Match Result Processing
================================================================================

X-1. Reduced Frame Map

RPR match results are recorded in a byte array called reduced_frame_map.
One byte is allocated per row, and the value is one of the following:

RF_NOT_DETERMINED (0) Not yet processed
RF_FRAME_HEAD (1) Start row of the match
RF_SKIPPED (2) Interior row of the match (skipped in frame)
RF_UNMATCHED (3) Match failure

The window function references this map to determine frame inclusion for
each row.

X-2. AFTER MATCH SKIP

Determines the starting point for the next match attempt after a successful
match:

SKIP TO NEXT ROW:
New match attempt begins from the row after the match start row.
Overlapping matches are possible.

SKIP PAST LAST ROW:
New match attempt begins from the row after the match end row.
Only non-overlapping matches are possible.

X-3. INITIAL vs SEEK

Standard definition (section 6.12):
INITIAL: "is used to look for a match whose first row is R."
SEEK: "is used to permit a search for the first match anywhere
from R through the end of the full window frame."
In either case, if there is no match, the reduced window frame is empty.
The default is INITIAL.

Current implementation:
SEEK is not supported (the parser raises an error).
Only INITIAL is supported, searching only for matches starting at each
row position pos.

Chapter XI Worked Example: Full Execution Trace
================================================================================

XI-1. Query

SELECT company, tdate, price,
first_value(price) OVER w AS start_price,
last_value(price) OVER w AS end_price
FROM stock
WINDOW w AS (
PARTITION BY company
ORDER BY tdate
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
AFTER MATCH SKIP PAST LAST ROW
PATTERN (A+ B)
DEFINE A AS price > PREV(price),
B AS price < PREV(price)
);

XI-2. Data

Row# tdate price
--------------------------
0 2024-01-01 100
1 2024-01-02 110
2 2024-01-03 120
3 2024-01-04 115
4 2024-01-05 130

XI-3. Compilation Result

PATTERN (A+ B) -> unchanged after optimization

idx varId depth min max next jump
------------------------------------------
0 A(0) 0 1 INF 1 -1 A+
1 B(1) 0 1 1 2 -1 B
2 FIN 0 1 1 -1 -1

DEFINE: A -> "price > PREV(price)", B -> "price < PREV(price)"
isAbsorbable = true (A+ is a simple unbounded VAR)

XI-4. Execution Trace

--- Row 0 (price=100) ---

update_reduced_frame(0) called.

Context C0 created (matchStartRow=0).
Initial advance: elemIdx=0(A) -> VAR, so state is added.
C0.states = [{elemIdx=0, counts=[0]}]

nfa_evaluate_row(0):
A: price(100) > PREV(price) -> no PREV -> false
B: price(100) < PREV(price) -> no PREV -> false
varMatched = [false, false]

nfa_process_row(0):
Phase 1 (Match): A(0) state vs varMatched[0]=false -> state removed
C0.states = [] (empty)

Phase 2 (Absorb): skipped (no states)
Phase 3 (Advance): skipped (no states)

C0.states is empty, so the loop terminates.
matchEndRow < matchStartRow -> RF_UNMATCHED.
register_reduced_frame_map(0, RF_UNMATCHED).

--- Row 1 (price=110) ---

update_reduced_frame(1) called.

Context C1 created (matchStartRow=1).
Initial advance: C1.states = [{elemIdx=0, counts=[0]}]

nfa_evaluate_row(1):
A: 110 > PREV(100) -> true
B: 110 < PREV(100) -> false
varMatched = [true, false]

nfa_process_row(1):
Phase 1 (Match): A(0) match succeeds -> counts[0]++ -> counts=[1]
C1.states = [{elemIdx=0, counts=[1]}]

Phase 3 (Advance):
State {elemIdx=0, counts=[1]}: A+ (min=1, count=1, max=INF)
count >= min, so:
Greedy -> loop first: keep {elemIdx=0, counts=[1]}
exit: reset counts[0]=0, next(=1) -> {elemIdx=1,
counts=[0]}
C1.states = [{elemIdx=0, counts=[1]}, {elemIdx=1, counts=[0]}]

--- Row 2 (price=120) ---

Context C2 created (matchStartRow=2).
Initial advance: C2.states = [{elemIdx=0, counts=[0]}]

nfa_evaluate_row(2):
A: 120 > PREV(110) -> true
B: 120 < PREV(110) -> false
varMatched = [true, false]

C1 nfa_process_row(2):
Phase 1 (Match):
{elemIdx=0, counts=[1]}: A matches -> counts=[2]
{elemIdx=1, counts=[0]}: B does not match -> removed
C1.states = [{elemIdx=0, counts=[2]}]

C2 nfa_process_row(2):
Phase 1 (Match):
{elemIdx=0, counts=[0]}: A matches -> counts=[1]
C2.states = [{elemIdx=0, counts=[1]}]

Phase 2 (Absorb):
Does C1 (started earlier) cover C2?
C1: {elemIdx=0, counts=[2]}, C2: {elemIdx=0, counts=[1]}
Same elemIdx, C1.counts >= C2.counts -> covered
C2 absorbed. -> removed.

Phase 3 (Advance):
{elemIdx=0, counts=[2]}: Greedy -> loop + exit
Loop: {elemIdx=0, counts=[2]}
Exit: reset counts[0]=0, next(=1) -> {elemIdx=1, counts=[0]}
C1.states = [{elemIdx=0, counts=[2]}, {elemIdx=1, counts=[0]}]

Context C3 created (matchStartRow=3).

--- Row 3 (price=115) ---

nfa_evaluate_row(3):
A: 115 > PREV(120) -> false
B: 115 < PREV(120) -> true
varMatched = [false, true]

nfa_process_row(3):
Phase 1 (Match):
{elemIdx=0, counts=[2]}: A does not match -> removed
{elemIdx=1, counts=[0]}: B matches -> counts=[1]
C1.states = [{elemIdx=1, counts=[1]}]

Phase 3 (Advance):
{elemIdx=1, counts=[1]}: B (min=1, max=1)
count(1) >= max(1) -> unconditional exit
Reset counts[0]=0, next = 2 (FIN)
FIN reached -> matchEndRow = 3, matchedState recorded.
Early termination: no remaining states, so completed immediately.
C1.states = [] (empty after reaching FIN)

C1.states is empty and matchEndRow=3 >= matchStartRow=1 -> match succeeds.

register_reduced_frame_map(1, RF_FRAME_HEAD)
register_reduced_frame_map(2, RF_SKIPPED)
register_reduced_frame_map(3, RF_SKIPPED)

--- Row 4 (price=130) ---

update_reduced_frame(4) called.
C3 was already created but matchStartRow=3, so it is not applicable.
New context C4 created (matchStartRow=4).

nfa_evaluate_row(4):
A: 130 > PREV(115) -> true
B: 130 < PREV(115) -> false

... No subsequent rows, so nfa_finalize_all_contexts() is called.
Match incomplete -> RF_UNMATCHED.

XI-5. Final Result

Row 0: RF_UNMATCHED -> frame = the row itself
Row 1: RF_FRAME_HEAD -> frame = rows 1 through 3
Row 2: RF_SKIPPED -> inside row 1's match
Row 3: RF_SKIPPED -> inside row 1's match
Row 4: RF_UNMATCHED -> frame = the row itself

Chapter XII Summary of Key Design Decisions
================================================================================

XII-1. Flat Array vs Tree-Based NFA

Choice: Flat array (RPRPatternElement[])

Rationale:
- Cache-friendly: 16-byte fixed size, contiguous memory
- Index-based references: 2-byte indices instead of pointers
- Easy to serialize: can use memcpy when passing to plan nodes

XII-2. Forward-only Execution vs Backtracking

Choice: Forward-only (state set tracking)

Rationale:
- Backtracking takes exponential time in the worst case
- NFA simulation guarantees polynomial time
- DFS order naturally guarantees preferment.
Greedy/reluctant per quantifier requires only reversing the DFS order
- Window functions receive sorted rows sequentially.
Forward-only fits directly into this pipeline,
whereas backtracking requires re-fetching previous rows
- DEFINE conditions are SQL expressions (PREV, RUNNING aggregates, etc.)
with high re-evaluation cost. Forward-only requires only one evaluation
per row

XII-3. Per-Context Management

Choice: Independent context per start row

Rationale:
- Supports overlapping matches under SKIP TO NEXT ROW
- Determines the frame for each row independently
- Absorption optimization can eliminate redundant contexts in O(n)

XII-4. Memory Pool Management

Choice: Custom free list

Rationale:
- NFA states are created and destroyed in large numbers per row
- Avoids palloc/pfree overhead
- State size is variable (counts[] array), but within a single query
maxDepth is fixed, so all states have the same size

XII-5. Execution Optimization Summary

The following optimizations make the NFA simulation practical.

-- Compile-time --

(1) AST Optimization (IV-3)

Simplifies the AST before converting the pattern to an NFA.
Reduces the number of NFA elements through consecutive variable
merging (A A -> A{2}), SEQ flattening, quantifier multiplication,
and other transformations.

Significance: Reducing the element count directly shrinks the state
space, decreasing the cost of all subsequent runtime phases (match,
absorb, advance).

-- Runtime: advance phase --

(2) Group Skip (IX-4(b))

At the BEGIN of a group with min=0, uses jump to skip the entire
group. Moves directly to the first element outside the group without
exploring the group body. Greedy enters then skips; Reluctant skips
then enters.

Significance: For optional groups (min=0), immediately generates
a skip path without exploring the body, avoiding unnecessary DFS
expansion.

(3) State Deduplication (IX-5)

During advance, DFS may generate states with the same (elemIdx,
counts) combination through multiple paths. Additionally, unlike
VAR repetition, group repetition cannot perform absorption
comparison using VAR states, so inline advance is performed from
after Phase 1 match through to END; this process can also produce
duplicate states reaching the same END.
nfa_add_state_unique() blocks duplicate addition of identical states
in both cases.

Significance: Prevents exponential growth of the state count in
ALT branches and quantifier expansion. Since DFS order causes the
preferred branch's state to be registered first, identical states
from lower-priority branches are automatically discarded, thereby
also guaranteeing preferment.

(4) Cycle Detection and Fast-Forward (IX-6, IX-4(c))

When a nullable group body (e.g., A?) repeats empty matches,
the END -> BEGIN loop-back can continue indefinitely.

Two mechanisms resolve this:
- A visited bitmap (nfaVisitedElems) blocks revisitation of the
same element, preventing infinite loops (safety)
- At an END with the RPR_ELEM_EMPTY_LOOP flag set, when
count < min, the remaining required iterations are treated as
empty matches and a fast-forward exit path out of the group is
added (correctness)

Significance: Cycle detection guarantees termination, and
fast-forward guarantees that the min condition is satisfied.
Without these, patterns containing nullable groups would fall
into infinite loops or fail to match.

(5) Match Pruning (IX-4(e))

When a state reaches FIN during advance, all remaining unprocessed
states of that context are removed. Because of DFS order, the path
that reaches FIN first has the highest preferment, so the remaining
paths are inferior.

Significance: Once the best match is determined, exploration of
inferior paths is immediately terminated. This mechanism achieves
both preferment guarantees and performance optimization.

-- Runtime: inter-context --

(6) Early Termination (SKIP PAST LAST ROW)

In SKIP PAST LAST ROW mode, when a match is found, subsequent
contexts whose start rows fall within the match range are pruned
immediately without further processing.
In SKIP TO NEXT ROW mode, overlapping contexts are preserved
because each row requires its own independent match.

Significance: Prunes subsequent contexts whose start rows overlap
with a prior match range, avoiding unnecessary processing.

(7) Context Absorption (Chapter VIII)

If an independent context is created for each row, O(n^2) states
accumulate. By exploiting the monotonicity that an earlier-started
context subsumes the states of a later-started context, redundant
contexts are eliminated early.

Absorbability is determined per-element; comparison is performed
only at elements with the RPR_ELEM_ABSORBABLE flag (see IV-5).

Significance: Keeps the number of active contexts at a constant
level, achieving O(n^2) -> O(n) time complexity. Without this,
performance degrades sharply on long partitions.

Appendix A. Key Function Index
================================================================================

Function File Role
----------------------------------------------------------------------
transformRPR parse_rpr.c Parser entry point
transformDefineClause parse_rpr.c DEFINE transformation
collectPatternVariables rpr.c Variable collection
filterDefineClause rpr.c DEFINE filtering
buildRPRPattern rpr.c NFA compilation main
optimizeRPRPattern rpr.c AST optimization
fillRPRPattern rpr.c NFA element
generation
finalizeRPRPattern rpr.c Finalization
computeAbsorbability rpr.c Absorption analysis
update_reduced_frame nodeWindowAgg.c Execution main loop
nfa_start_context nodeWindowAgg.c Context creation
nfa_process_row nodeWindowAgg.c 3-phase processing
nfa_evaluate_row nodeWindowAgg.c DEFINE evaluation
nfa_match nodeWindowAgg.c Phase 1
nfa_absorb_contexts nodeWindowAgg.c Phase 2
nfa_advance nodeWindowAgg.c Phase 3
nfa_advance_state nodeWindowAgg.c Per-state branching
nfa_route_to_elem nodeWindowAgg.c Element routing
nfa_advance_alt nodeWindowAgg.c ALT handling
nfa_advance_begin nodeWindowAgg.c BEGIN handling
nfa_advance_end nodeWindowAgg.c END handling
nfa_advance_var nodeWindowAgg.c VAR handling
nfa_add_state_unique nodeWindowAgg.c Deduplication
nfa_states_covered nodeWindowAgg.c Absorption coverage
check

Appendix B. Data Structure Relationship Diagram
================================================================================

Appendix C. NFA Element Array Examples
================================================================================

C-1. PATTERN (A B C)

idx varId depth min max next jump
------------------------------------------
0 A 0 1 1 1 -1
1 B 0 1 1 2 -1
2 C 0 1 1 3 -1
3 FIN 0 1 1 -1 -1

C-2. PATTERN (A+ B*)

idx varId depth min max next jump flags
-------------------------------------------------
0 A 0 1 INF 1 -1 ABSORBABLE | ABSORBABLE_BRANCH
1 B 0 0 INF 2 -1
2 FIN 0 1 1 -1 -1

Only A+ is the absorption point (Case 1). Once past A,
absorption is permanently disabled for that state.

C-3. PATTERN (A | B | C)

idx varId depth min max next jump
------------------------------------------
0 ALT 0 1 1 1 -1 alternation start
1 A 1 1 1 4 2 branch 1 -> FIN, jump -> branch
2
2 B 1 1 1 4 3 branch 2 -> FIN, jump -> branch
3
3 C 1 1 1 4 -1 branch 3 -> FIN
4 FIN 0 1 1 -1 -1

C-4. PATTERN ((A B)+ C)

idx varId depth min max next jump flags
-----------------------------------------------------------
0 BEGIN 0 1 INF 1 4 ABSORBABLE_BRANCH
1 A 1 1 1 2 -1 ABSORBABLE_BRANCH
2 B 1 1 1 3 -1 ABSORBABLE_BRANCH
3 END 0 1 INF 4 1 ABSORBABLE | ABSORBABLE_BRANCH
4 C 0 1 1 5 -1
5 FIN 0 1 1 -1 -1

Case 2: GROUP+ with {1,1} body VARs. A, B are branches;
END is the absorption point. Compare with C-6 (Case 3).

C-5. PATTERN ((A | B)+? C)

idx varId depth min max next jump flags
---------------------------------------------------
0 BEGIN 0 1 INF 1 5 RELUCTANT, group start
1 ALT 1 1 1 2 -1 alternation start
2 A 2 1 1 4 3 branch 1
3 B 2 1 1 4 -1 branch 2
4 END 0 1 INF 5 1 RELUCTANT, group end
5 C 0 1 1 6 -1
6 FIN 0 1 1 -1 -1

C-6. PATTERN ((A+ B)+ C) -- Absorbability flag example

idx varId depth min max next jump flags
----------------------------------------------------------
0 BEGIN 0 1 INF 1 4 ABSORBABLE_BRANCH, group start
1 A 1 1 INF 2 -1 ABSORBABLE | ABSORBABLE_BRANCH
2 B 1 1 1 3 -1
3 END 0 1 INF 4 1 group end
4 C 0 1 1 5 -1
5 FIN 0 1 1 -1 -1

Recurses from BEGIN into the body -> A matches Case 1 (simple VAR+).
A gets ABSORBABLE | ABSORBABLE_BRANCH, BEGIN gets ABSORBABLE_BRANCH.
B and END get no flags -> absorption stops once the state advances to B.
(See IV-5 Case 3)

C-7. PATTERN ((A+ B | C*)+ D) -- Per-branch absorption in ALT

idx varId depth min max next jump flags
----------------------------------------------------------
0 BEGIN 0 1 INF 1 6 ABSORBABLE_BRANCH
1 ALT 1 1 1 2 -1 ABSORBABLE_BRANCH
2 A 2 1 INF 3 4 ABSORBABLE | ABSORBABLE_BRANCH
3 B 2 1 1 5 -1
4 C 2 0 INF 5 -1 ABSORBABLE | ABSORBABLE_BRANCH
5 END 0 1 INF 6 1 EMPTY_LOOP
6 D 0 1 1 7 -1
7 FIN 0 1 1 -1 -1

ALT branches are checked independently for absorbability.
Branch 1: A+ matches Case 1 -> A gets ABSORBABLE. B has no flag.
Branch 2: C* matches Case 1 -> C gets ABSORBABLE.
Both A and C get ABSORBABLE_BRANCH as part of their respective branch
paths.
END has EMPTY_LOOP: branch 2 (C*) is nullable, making the group body
nullable.
BEGIN and ALT get ABSORBABLE_BRANCH (on the path to absorbable elements).

================================================================================
End of document
================================================================================

Best regards,
Henson

Attachment	Content-Type	Size
RPR_NFA_Algorithm_Guide.txt	text/plain	53.6 KB
RPR_NFA_Algorithm_Guide_KO.txt	text/plain	56.8 KB
RPR_NFA_Algorithm_Guide_JA.txt	text/plain	61.3 KB

In response to

Re: Row pattern recognition at 2026-02-26 13:34:03 from Henson Choi

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Alvaro Herrera	2026-02-26 13:52:10	Re: pgstat include expansion
Previous Message	Jim Jones	2026-02-26 13:39:02	Re: COMMENTS are not being copied in CREATE TABLE LIKE