| From: | Henson Choi <assam258(at)gmail(dot)com> |
|---|---|
| To: | jian he <jian(dot)universality(at)gmail(dot)com>, Tatsuo Ishii <ishii(at)postgresql(dot)org> |
| Cc: | zsolt(dot)parragi(at)percona(dot)com, sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, li(dot)evan(dot)chao(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Row pattern recognition |
| Date: | 2026-05-27 05:49:44 |
| Message-ID: | CAAAe_zDephfiDA_A3FN0hCymJRogEr=Rt3QoCTf4qMYDLk+xNA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi jian, Tatsuo,
Thanks for the thorough first read of execRPR.c and the NFA. The
depth of understanding you reached on a fairly intricate body of
code in a short window is impressive -- and several of the open
questions in your review pointed straight at documentation and
comment debt on our side that needed exposing. First-reviewer
feedback of that kind is especially valuable, because the gaps it
surfaces are exactly the ones the next reader would otherwise have
to re-discover; closing them now puts everyone reading the code
afterwards on firmer ground. Would be glad to keep working together
on this area.
> Disclaimer: I had some private off-list discussions with Henson Choi
> regarding execRPR.c, and the NFA.
> The following are more formal comments regarding execRPR.c and the NFA
> algorithm, based on v47-0001 to v47-0009, I haven't applied the
> incremental diff yet.
> (Since the incremental diffs are scattered across several threads, a
> unified v48 would be better).
> (I'm still wrapping my head around the NFA in execRPR.c, so take the
> comments below with a grain of salt.)
Agreed on the unified v48 -- both the scattered incrementals and
the responses to your review will be part of that series. The
responses below will land alongside their corresponding change
(README, comment, test, or commit) in that series rather than as
a free-standing reply that defers the actual fixes.
> PATTERN (A B C D)
> We can short-circuit and exit early if any of the evaluations (A, B,
> or C) fail in nfa_match.
> This is necessary since the chance of a pattern element evaluation
> returning false is not rare, I think.
Agreed on the optimization intent. A slightly different shape worth
considering: rather than eager short-circuit in nfa_match, defer
DEFINE predicate evaluation to first use -- varMatched[i] computed
on demand and cached per row, so pruned paths never pay for
predicates they didn't reach. The code change is small; the part
needing care is whether "not evaluating" is safe for every predicate
(navigation state, slot setup, side effects). The current code or
the standard may already enforce constraints that make lazy
evaluation naturally safe, but I'd rather not act on that assumption
without concrete grounding in what the code actually enforces or in
test coverage that pins it down.
Would you be interested in driving the discussion in this thread?
You've been deep in execRPR.c recently, so the trade-offs sit
closest to you, and I think Tatsuo will land a good conclusion once
they're on the table. Happy to support the discussion in whatever
way is useful as it converges.
> For src/backend/executor/README.rpr:
> We should explicitly explain 'absorbable' and 'absorption' somewhere in
> README.rpr, as the text currently just assumes the reader knows what they
mean.
> Using some example illustrate "absorption" meaning, put it on
> README.rpr would be great.
> We can also mention that 'DFS' refers to Depth-First Search".
Acknowledged, and the request surfaced an underlying problem in the
README's terminology. "Absorption" is currently used for two
distinct things: an AST-level rewrite in Phase 1 that pulls
identical sequences around a group inside it, and the runtime
context-equivalence collapse that drives the O(n^2) -> O(n)
optimization. Sharing the word leaves a reader encountering
"absorbable" early on without an anchor.
Rather than disambiguate by qualifier ("prefix/suffix absorption"
vs "context absorption"), I'd lean toward renaming the AST-level
case so "absorption" stays reserved for the runtime concept. The
README then only needs to explain absorption in one place, in
detail, without the disambig preamble.
For the rename, "prefix/suffix merging" feels like the natural fit
-- the other AST-level optimizations in the same Phase 1 are already
named "consecutive variable / group / ALT merging", so it slots in
cleanly. "Prefix/suffix factoring" is another candidate if a more
descriptive verb is preferred.
Tatsuo, curious what you think of this direction and naming. Happy
to take any name you prefer for the AST-level operation, or to keep
the original "absorption" wording with stronger forward-references
if you'd rather not rename.
For the absorption explanation itself in README.rpr, the diagnosis
I'd offer is that Chapter VIII already carries the necessary content
-- the issue is narrative order. VIII-1 leads with the O(n^2) problem
framing, so a reader meets the cost shape before meeting the
intuition for why absorption is possible, and has to carry the
problem until VIII-2's monotonicity argument finally lands. Beyond
that, VIII-2 stays abstract; there is no row-by-row trace showing
two states being judged equivalent.
Two small additions seem to close most of the gap:
(A) A 4-5 line intuition summary at the top of Chapter VIII,
before VIII-1, naming what absorption is (collapsing contexts
that have converged on identical future behavior) and the
monotonicity principle that makes it safe. This gives the
reader an anchor before the problem framing.
(B) A short worked example at the end of VIII-2: a PATTERN (A+)
trace over a few rows showing each new context being absorbed
by Context_1 once its (elemIdx, depth-0 count) is dominated.
Concrete state/count comparisons make the abstract solution
land.
Curious if this read of the gap matches what tripped you up, and
whether (A) + (B) feel sufficient. Happy to draft both as part of
the v48 README changes.
For DFS, will expand it to "Depth-First Search (DFS)" at the first
occurrence.
> ``````
> (4) Call nfa_advance(initialAdvance=true)
> ``````
> In V47, the variable `initialAdvance` does not exist.
Leftover from an earlier patch version -- the boolean parameter was
refactored away and the README notation wasn't updated. I'll bring
it in line with the current signature.
> In nfa_advance_var, after the first Assert, we can add:
> Assert(elem->next < pattern->numElements);
Agreed. Will add it right after Assert(canLoop || canExit) in
nfa_advance_var, with a >= 0 lower bound tacked on while there
(RPRElemIdx is signed int16, INVALID = -1):
Assert(elem->next >= 0 && elem->next < pattern->numElements);
> ExecRPRFinalizeAllContexts seems unnecessary; I commented it out,
> rerun the regress tests
> (TESTS='test_setup rpr_base rpr_nfa rpr_explain rpr_integration rpr'
> meson test -C $BUILD3 --num-processes 20 --suite regress --verbose)
> Only two SQL tests in rpr_explain.sql failed.
Reproduced this. You're right on correctness and memory: data rows
are identical with the call removed, and release_partition's
MemoryContextReset reclaims memory anyway.
Finalize isn't really about handling matches. By the time the
partition ends, all genuine FIN reaches have already been recorded
in-flight. Its job is to kill any VAR states still pursuing when
rows run out, so cleanup sees a uniform ctx->states == NULL across
every context. Three shapes survive there:
- Pure pursuit (no matchedState, e.g., A+ B mid-pattern).
- Empty-match candidate + pursuit (matchedState set with
matchEndRow < matchStartRow -- e.g., greedy A* with no
successful matches yet, while VAR is still chasing a longer
non-empty one).
- Real match + pursuit (matchedState set with matchEndRow >=
matchStartRow -- e.g., greedy A* with some matches recorded
and still looping for a longer one).
The first two get reclassified as failures by cleanup; without
Finalize they linger without contributing to stats. The third is
stat-neutral -- cleanup skips it either way -- but goes through
the same uniform path so partition-end classification stays
centralized.
The classification surfaces today only via rpr_explain stats, but
becomes user-visible once we extend the R020 surface or move into
R010 -- MEASURES and eventual R010 hooks count matches based on
this classification. Worth locking in now, and an explicit
partition-end stage is structurally cleaner than scattering the
logic.
Plan: keep the call and reframe it as the partition-end
classification policy holder. Strategically, that gives the future
partition-end hooks a single anchor to extend, instead of growing
scattered end-of-partition paths.
> SELECT first_value(id) OVER w AS match_start FROM stock_ticks
> WINDOW w AS ( ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED
> FOLLOWING AFTER MATCH SKIP PAST LAST ROW PATTERN ((A B) {2}) DEFINE A
> AS price < 100, B AS price < 100);
>
> The query above invokes the following code. Since the PATTERN above is
> not greedy, is the comment below incorrect?
> ``````
> else
> {
> /* Greedy: enter first, skip second */
> ...
> }
> ``````
The comment is misleading; the code is correct. Two pieces:
First, greedy vs reluctant: '?' plays two distinct roles in our
grammar. As a quantifier on its own (A?, (A B)?) it means "optional"
-- equivalent to {0,1}. As a suffix on another quantifier (A+?,
(A B)*?, {n}?, etc.) it makes that quantifier reluctant. So {n}
without a trailing '?' is greedy; the {n}? form is reluctant.
PATTERN ((A B){2}) is in fact greedy.
Second, why the branch is entered: nfa_advance_begin's else arm
handles two cases at once:
(a) Greedy with optional group: skipState != NULL, not reluctant.
"Enter the group; also create the skip path."
(b) Non-nullable group (min > 0, regardless of greedy/reluctant):
skipState stays NULL, the outer guard
"if (skipState != NULL && RPRElemIsReluctant(elem))" falls
through, and the inner "if (skipState != NULL)" prevents the
skip-path action from running.
(A B){2} has min = max = 2, so it lands in (b) -- the action that
actually runs is "enter the group", no skip path. The current label
only describes (a), which is why it reads wrong for your test query.
Plan is to rewrite the comment along the lines of
"Greedy-or-non-nullable: route to the first child; for optional
groups (skipState != NULL), additionally create the skip path."
> nfa_advance_var
> ```
> else if (canExit)
> {
> ...
> }
> ```
> The above ELSE IF overrides all RPRNFAState field values except
> RPRNFAState->next.
> Should we set RPRNFAState->next to NULL?
> (If I add ``state->next = NULL;`` in the above ELSE IF branch, all the
> regress tests still pass)
Good catch on the asymmetry. Tracing it through, state->next is
actually already NULL at every branch you flagged: nfa_advance
resets it just before crossing into nfa_advance_state, and the
intermediate branches don't disturb it. Your experiment passing
with an added "state->next = NULL" is consistent with that -- the
assignment is redundant rather than load-bearing.
The contract that keeps state->next sane lives at two concentrated
points (nfa_advance entry, nfa_add_state_unique linking), and the
branches in between are pass-through. Sprinkling the same reset at
every branch would be defensive noise rather than a real safety
net, so I'd leave the branches alone.
Happy to add a short comment near nfa_advance's reset marking it
as the boundary contract, so the next reader doesn't trip on the
same question.
> For function nfa_advance_var, I don't understand the meaning of the
> variable "count", after the first Assert I have added below:
> ...
> Rerunning the regress tests shows that count >= 3 occurs very
infrequently.
> ...
> Can we add more complex queries (more count >= 3) to check if the
> "count" variable is working correctly?
The "count" semantics will read more cleanly once the absorption
README work above lands (counts[d] = iteration count at nesting
depth d). For coverage I'll add a nested reluctant quantifier to
rpr_nfa (e.g., PATTERN ((A B){3,5}? C)) to drive count through the
3..5 band repeatedly. (rpr_nfa is the suite that already targets
Quantifier Runtime Behavior and Absorption Optimization.)
> In function nfa_add_state_unique:
> /* Mark VAR in visited before duplicate check to prevent DFS loops */
> ...
> I honestly don't understand the purpose of the code block above. But it
doesn't
> seem to influence the subsequent FOR LOOP;
> ...
> Could we add some comments explaining which external functions rely on
> this code and why it belongs in nfa_add_state_unique?
The code is correct, but the contract is split across two functions
and currently only one side points to the other. The visited marking
scheme is asymmetric on purpose:
- Non-VAR elements (END/ALT/BEGIN/FIN) are marked on entry to
nfa_advance_state because epsilon cycles must be prevented
immediately.
- VAR elements are marked later, in nfa_add_state_unique, only
when added to the state list. That delay is intentional: it
keeps legitimate quantifier loop-back to the same VAR across
iterations possible.
The paired cycle check sits in nfa_advance_state, and the
asymmetric-marking rationale is documented there. What's missing is
the back-reference from nfa_add_state_unique. I'll add a single line
at the marking site pointing back to nfa_advance_state.
> nfa_states_equal
> compareDepth = elem->depth + 1; /* depth 0 needs 1 count, etc. */
> The comment above isn't helpful, IMHO, and I don't understand it.
> We should focus on why compareDepth should be ```elem->depth + 1```.
Agree the trailing comment is too terse. Two pieces are missing:
(a) The +1 arithmetic: to compare counts up to depth N, we need
slots counts[0..N], which is N+1 entries.
(b) Why deeper slots are excluded: counts[d > elem->depth] are
scratch state from deeper groups and get reset on re-entry,
so they must not participate in equivalence judgment.
Two states sharing elemIdx are equivalent iff all
enclosing-or-current depth counts match. I'll replace the trailing
comment with a small block covering both pieces.
> function nfa_add_state_unique return value is not being used?
> Do we need to do something with the return value, or is this expected?
> (I don't have an opinion on it, I guess it would be better to raise this
issue)
Leftover from an earlier design -- the duplicate case is fully
handled inside the function (the state is freed and nfaStatesMerged
is incremented), so callers have nothing to branch on, and indeed
none of them do. Will change the signature from bool to void and
drop the return statements.
> In nfa_advance_alt, during the main WHILE loop, I think altElem->depth
> must be larger than elem->depth.
> Therefore we can do
> ``````
> if (altElem->depth == elem->depth)
> elog(ERROR, "nfa_advance_alt altElem->depth should not be
> the same as elem->depth reached");
> if (altElem->depth < elem->depth)
> break;
> ``````
I had to push back on this one. Tracing the depth bookkeeping:
- For an ALT at depth D, branches sit at depth D+1, and each
branch's first element has .jump pointing to the next branch's
first (set in fillRPRPatternAlt). So the walk normally
terminates when the last branch's .jump = INVALID -- the depth
check doesn't fire at all.
- But when the last branch is a quantified group, its first
element is a BEGIN whose .jump = past-END (set by
fillRPRPatternGroup and not overridden for the last branch).
The walk then steps to a post-ALT element, and the depth check
is what stops it from creating a stray state out there.
That post-ALT element has depth <= D:
* D-1 if the ALT is inside an enclosing group with a non-trivial
quantifier, e.g., PATTERN ((A | (B C)+){2}) -- post-ALT lands
on the outer END at depth 0, ALT at depth 1. (A {1,1} outer
wrap gets removed by single-child unwrap, so it has to be a
real quantifier.)
* D if the ALT has a sibling at the same level, e.g.,
PATTERN (A | (B C)+) at top level -- post-ALT is FIN at depth 0,
matching the ALT's depth 0.
So "altElem->depth == elem->depth" is a legitimate end-of-walk
signal for the quantified-group-last-branch case, not an invariant
violation. Treating it as an error would misfire on patterns like
A | (B C)+. The current "if (altElem->depth <= elem->depth) break;"
in nfa_advance_alt is intentionally <= and not <, and the looser
comparison is correct. Happy to add a brief comment there noting
the trigger condition, if it would help future readers.
Summary of decisions, in the order above:
Short-circuit optimization Separate series -- invite you to drive
Absorption README narrative Accept -- Chapter VIII summary + example
AST-level "absorption" rename Pending Tatsuo's call -- prefix/suffix
merging?
DFS expansion Accept
initialAdvance README mismatch Accept -- align with current signature
Defensive Assert in advance_var Accept -- also add lower bound
Finalize unnecessary? Keep -- partition-end policy holder
Greedy comment label Accept -- rewrite to cover both cases
state->next reset Decline -- boundary contract covers it
count >= 3 test coverage Accept -- add to rpr_nfa
visited marking purpose Accept -- add back-reference comment
compareDepth comment Accept -- rewrite with intent
Unused bool return Accept -- change to void
ALT depth invariant Assert Decline -- end-of-walk signal, not
invariant
That's the full pass. The actual patches (nocfbot-0016 onward) will
follow shortly as a separate submission, for another review round.
Thanks,
Henson
| From | Date | Subject | |
|---|---|---|---|
| Next Message | vellaipandiyan sm | 2026-05-27 05:54:06 | Re: Can we use Statistics Import and Export feature to perforamance testing? |
| Previous Message | Hayato Kuroda (Fujitsu) | 2026-05-27 05:43:15 | RE: 035_standby_logical_decoding might fail due to FATAL message lost inside libpq |