| From: | Henson Choi <assam258(at)gmail(dot)com> |
|---|---|
| To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
| Cc: | zsolt(dot)parragi(at)percona(dot)com, sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Row pattern recognition |
| Date: | 2026-03-29 12:13:47 |
| Message-ID: | CAAAe_zBCF3dwSjStmG0kJqw_y1z8QD73Rf1G58QTKEvd9tScwA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Tatsuo,
Attached are 6 incremental patches on top of v46.
0001-0005 are small independent fixes. 0006 is the main
PREV/NEXT navigation redesign that was discussed as the
"experimental" implementation in prior messages.
Here is a summary of each patch:
0001: Remove unused regex/regex.h include from nodeWindowAgg.c
The regex header was left over from an earlier implementation
that used the regex engine for pattern matching. The current
NFA engine in execRPR.c does not use it.
0002: Add CHECK_FOR_INTERRUPTS() to nfa_add_state_unique()
This function iterates through a linked list of NFA states
to check for duplicates. In state explosion scenarios
(complex patterns with alternations and quantifiers), this
list can grow significantly. Without an interrupt check,
the loop is unresponsive to query cancellation.
0003: Add CHECK_FOR_INTERRUPTS() to nfa_try_absorb_context()
Similar to 0002. The absorption loop iterates through the
context chain, which can grow unbounded with streaming
window patterns. Each iteration also calls
nfa_states_covered() which has its own loop.
0004: Fix in-place modification of defineClause TargetEntry
In set_upper_references(), the defineClause TargetEntry
was modified in-place by fix_upper_expr() without making
a copy first. This follows the same flatCopyTargetEntry()
pattern already used for the main targetlist in the same
function.
0005: Fix mark handling for last_value() under RPR
This is a revised version of the v45 0004 patch that you
commented on [1]. You pointed out that suppressing
WinSetMarkPosition() entirely under RPR was too strong a
limitation, and we agreed to drop it and revisit together
with the experimental PREV/NEXT patch.
The RPR executor patch changed set_mark from true to
false in window_last_value() to avoid mark advancement
problems under RPR. But this also penalizes non-RPR
queries by preventing tuplestore memory reclamation.
The revised approach: restore set_mark=true (upstream
default), and add a targeted guard in
WinGetFuncArgInFrame() that only suppresses mark
advancement for SEEK_TAIL under RPR -- not for all
seek types as in v45 0004. Only SEEK_TAIL needs
suppression because advancing the mark from the tail
could prevent revisiting earlier rows that still fall
within a future row's reduced frame.
[1]
https://www.postgresql.org/message-id/20260321.140232.1947157589839395257.ishii%40postgresql.org
0006: Implement 1-slot PREV/NEXT navigation for RPR
This is the main patch. It redesigns PREV/NEXT navigation
from the 3-slot model (outer/scan/inner with varno
rewriting) to a 1-slot model using expression opcodes.
Key changes:
- RPRNavExpr: a new expression node type that replaces
the previous approach of identifying PREV/NEXT by
funcid. The parser transforms PREV/NEXT function calls
into RPRNavExpr nodes in ParseFuncOrColumn().
- EEOP_RPR_NAV_SET/RESTORE: two new opcodes that
temporarily swap ecxt_outertuple to the target row
during expression evaluation. The argument expression
evaluates against the swapped slot, then the original
slot is restored. This eliminates varno rewriting and
naturally supports arbitrary offsets.
- 2-argument form: PREV(value, offset) and
NEXT(value, offset) are now supported. The offset
defaults to 1 if omitted; offset=0 refers to the
current row. The offset must be a run-time constant
per the SQL standard.
- nav_winobj: a dedicated WindowObject with its own
tuplestore read pointer, separate from aggregate
processing. A mark pointer pinned at position 0
prevents tuplestore truncation so that PREV(expr, N)
can reach any prior row. This is conservative -- it
retains the entire partition -- but since the standard
requires offsets to be run-time constants, we could
advance the mark to (currentpos - max_offset) in a
future optimization. I'd prefer to defer that until
the basic navigation is stable.
- Documentation and tests updated.
Changes from the experimental version of 0006:
- NULL/negative offset error rationale changed from "matching
Oracle behavior" to the SQL standard (ISO/IEC 9075-2,
Subclause 5.6.2: "There is an exception if the value of
the offset is negative or null"). The behavior is the same;
only the comment was corrected.
- RPRNavKind changed from -1/+1 values to plain enum values.
The previous version used kind as an arithmetic multiplier
(offset * kind), but this pattern cannot extend to FIRST/LAST
which have different position semantics. The new version uses
a switch statement with pg_sub/add_s64_overflow instead.
- Parser validation walkers consolidated from three separate
walkers into a single NavCheckResult struct + nav_check_walker,
reducing tree traversals from up to 4 per RPRNavExpr to 1-2.
- JIT changed from attempting to compile to explicit interpreter
fallback. See item 2 below for the rationale.
- Additional test cases for subquery offset (caught by
DEFINE-level restriction), first-arg subquery, and first-arg
volatile function (allowed per standard).
I've reviewed and revised 0006 since the experimental version.
I think it is now ready to be included in the next version of
the patch set. It passes all existing regression tests (rpr,
rpr_base, rpr_explain, rpr_nfa) and adds comprehensive tests
for the new functionality.
Two items I'd like your opinion on:
1. 0005 (mark handling): The revised approach only suppresses
mark advancement for SEEK_TAIL under RPR, unlike v45 0004
which suppressed it for all seek types. Does narrowing to
SEEK_TAIL seem reasonable to you, or do you see cases where
other seek types could also be affected?
2. LLVM JIT fallback (in 0006): The mid-expression slot swap
conflicts with how JIT caches tuple data pointers. The
experimental version produced wrong results under
jit_above_cost=0, and I have not found a clean fix within
the current JIT framework. For now, DEFINE expressions
containing PREV/NEXT fall back to the interpreter; other
expressions in the same query are still JIT-compiled.
I'd like to keep this as-is for now and look for a proper
JIT solution over time. Does that sound reasonable?
Best regards,
Henson
| Attachment | Content-Type | Size |
|---|---|---|
| nocfbot-0001-Remove-unused-regex-regex.h-include-from-nodeWindowA.txt | text/plain | 764 bytes |
| nocfbot-0002-Add-CHECK_FOR_INTERRUPTS-to-nfa_add_state_unique-for.txt | text/plain | 830 bytes |
| nocfbot-0003-Add-CHECK_FOR_INTERRUPTS-to-nfa_try_absorb_context-l.txt | text/plain | 864 bytes |
| nocfbot-0004-Fix-in-place-modification-of-defineClause-TargetEntr.txt | text/plain | 1.0 KB |
| nocfbot-0005-Fix-mark-handling-for-last_value-under-RPR.txt | text/plain | 1.8 KB |
| nocfbot-0006-Implement-1-slot-PREV-NEXT-navigation-for-RPR.txt | text/plain | 90.6 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tomas Vondra | 2026-03-29 12:16:47 | Re: Compress prune/freeze records with Delta Frame of Reference algorithm |
| Previous Message | Evgeny Voropaev | 2026-03-29 11:47:09 | Re: Compress prune/freeze records with Delta Frame of Reference algorithm |