Re: Row pattern recognition

From: Henson Choi <assam258(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Row pattern recognition
Date: 2026-02-22 11:09:52
Message-ID: CAAAe_zDcrF9_m5FnWFgdrCpxjTqdH2ZMDmpKXevMG4KBHJsw4w@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Tatsuo,

Here are two incremental patches on top of v43 + our previous two.

nocfbot-0003: Fix ALT lexical ordering via DFS early termination

The altPriority FIXME turned out to have a simple solution. The key
insight is that the NFA state list is already in lexical order from
DFS traversal, so when a state reaches FIN, all remaining states in
the list have worse lexical priority and can be pruned immediately.

This makes the altPriority field unnecessary -- DFS traversal order
itself guarantees correct lexical ordering. The patch removes
altPriority from RPRNFAState entirely and adds early termination
in nfa_advance(): when a new FIN is reached, the remaining states
are freed and processing stops.

This also implements the state pruning optimization I mentioned
earlier -- it falls out naturally from the same mechanism.

Changes:
- Remove altPriority field from RPRNFAState and all call sites
- Add early termination in nfa_advance() on new FIN arrival
- Simplify nfa_add_matched_state() to unconditional replacement
- Fix outer END count increment for quantified VARs in
nfa_advance_var()
- Remove FIXME 1 test cases, add test_alt_lexical_order test
- Add EXPLAIN ANALYZE test verifying early termination statistics

nocfbot-0004: Implement reluctant quantifiers

With the DFS early termination infrastructure from nocfbot-0003,
reluctant quantifiers become straightforward: reverse the DFS
traversal order so that shorter matches are explored first, then
apply the same early termination to stop at the shortest match.

Greedy explores enter/loop before skip/exit; reluctant reverses
this to skip/exit before enter/loop. When the preferred (shorter)
path reaches FIN, the longer path is pruned immediately.

Changes:
- Remove parser error rejecting reluctant quantifier syntax
- Extend tryUnwrapGroup() to propagate reluctant flag when
unwrapping single-child groups: (A)+? -> A+?
- Add reluctant branching in nfa_advance_begin(),
nfa_advance_end(), and nfa_advance_var() with per-branch
early termination
- Add tests in rpr_base.sql, rpr_nfa.sql, and rpr.sql covering
basic reluctant semantics, quantifier boundaries, interaction
with greedy quantifiers, and nested/alternation combinations

> I found reluctant quantifiers are useful but also I don't want to make
> patch sets far bigger. How do you estimate the difficulty and the size
> of the code for reluctant quantifiers?

You asked earlier how I estimated the difficulty of reluctant
quantifiers. It turned out the answer was subtraction, not
addition -- removing altPriority and simplifying the match logic
first made reluctant quantifiers almost trivial to add on top.

Next I plan to work on the remaining FIXME (cycle prevention for
patterns like (A*)*), A{0} implementation, and test reorganization.

Best regards,
Henson

>

Attachment Content-Type Size
nocfbot-0003-DFS-early-termination.patch application/octet-stream 23.8 KB
nocfbot-0004-Implement-reluctant-quantifiers.patch application/octet-stream 65.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2026-02-22 11:10:57 Re: Comment for UserMappingPasswordRequired in contrib/postgres_fdw
Previous Message Fabrice Chapuis 2026-02-22 10:16:50 record with incorrect prev-link