Re: Row pattern recognition

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: assam258(at)gmail(dot)com
Cc: vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Row pattern recognition
Date: 2026-02-23 10:26:46
Message-ID: 20260223.192646.1862474650359229532.ishii@postgresql.org
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Henson,

> Hi Tatsuo,
>
> Here are two incremental patches on top of v43 + our previous two.

I have tried v43 + those patches and it passed the regression test.

> nocfbot-0003: Fix ALT lexical ordering via DFS early termination
>
> The altPriority FIXME turned out to have a simple solution. The key
> insight is that the NFA state list is already in lexical order from
> DFS traversal, so when a state reaches FIN, all remaining states in
> the list have worse lexical priority and can be pruned immediately.
>
> This makes the altPriority field unnecessary -- DFS traversal order
> itself guarantees correct lexical ordering. The patch removes
> altPriority from RPRNFAState entirely and adds early termination
> in nfa_advance(): when a new FIN is reached, the remaining states
> are freed and processing stops.
>
> This also implements the state pruning optimization I mentioned
> earlier -- it falls out naturally from the same mechanism.
>
> Changes:
> - Remove altPriority field from RPRNFAState and all call sites
> - Add early termination in nfa_advance() on new FIN arrival
> - Simplify nfa_add_matched_state() to unconditional replacement
> - Fix outer END count increment for quantified VARs in
> nfa_advance_var()
> - Remove FIXME 1 test cases, add test_alt_lexical_order test
> - Add EXPLAIN ANALYZE test verifying early termination statistics

Sounds good.

> nocfbot-0004: Implement reluctant quantifiers
>
> With the DFS early termination infrastructure from nocfbot-0003,
> reluctant quantifiers become straightforward: reverse the DFS
> traversal order so that shorter matches are explored first, then
> apply the same early termination to stop at the shortest match.
>
> Greedy explores enter/loop before skip/exit; reluctant reverses
> this to skip/exit before enter/loop. When the preferred (shorter)
> path reaches FIN, the longer path is pruned immediately.
>
> Changes:
> - Remove parser error rejecting reluctant quantifier syntax
> - Extend tryUnwrapGroup() to propagate reluctant flag when
> unwrapping single-child groups: (A)+? -> A+?
> - Add reluctant branching in nfa_advance_begin(),
> nfa_advance_end(), and nfa_advance_var() with per-branch
> early termination
> - Add tests in rpr_base.sql, rpr_nfa.sql, and rpr.sql covering
> basic reluctant semantics, quantifier boundaries, interaction
> with greedy quantifiers, and nested/alternation combinations

This is really good news!

>> I found reluctant quantifiers are useful but also I don't want to make
>> patch sets far bigger. How do you estimate the difficulty and the size
>> of the code for reluctant quantifiers?
>
> You asked earlier how I estimated the difficulty of reluctant
> quantifiers. It turned out the answer was subtraction, not
> addition -- removing altPriority and simplifying the match logic
> first made reluctant quantifiers almost trivial to add on top.

Glad to hear that you found simpler solution to implement reluchtant
quantifiers.

> Next I plan to work on the remaining FIXME (cycle prevention for
> patterns like (A*)*), A{0} implementation, and test reorganization.

Ok. Looking forward to the next patches.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2026-02-23 10:38:01 Re: refactor architecture-specific popcount code
Previous Message jian he 2026-02-23 10:20:35 Re: NOT NULL NOT ENFORCED