| From: | Henson Choi <assam258(at)gmail(dot)com> |
|---|---|
| To: | jian(dot)universality(at)gmail(dot)com |
| Cc: | ishii(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org, zsolt(dot)parragi(at)percona(dot)com, sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, li(dot)evan(dot)chao(at)gmail(dot)com |
| Subject: | Re: Row pattern recognition |
| Date: | 2026-07-02 00:23:28 |
| Message-ID: | CAAAe_zCtsuY0odaocp6PiR6HOPG5CCSzFoOCL3P4cTq93mcEFg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Jian,
Thanks for the review, and for basing it on the RPR branch. Responses
inline.
> ``(only possible during early development/testing).`` comment is not
necessary?
Agreed, and it is actually misleading: varMatched == NULL is a live
signal, not a dev-only artifact. nfa_match() is called with NULL to force
every VAR to not-match at a frame boundary (execRPR.c, "Frame boundary
reached: force mismatch"). I will rewrite the comment to describe that
meaning instead of the "early development/testing" note.
> nfa_match desperately needs an extra i(nt64 currentPos) argument. [...]
> adding currentPos makes it much easier to understand the current
> execution state when debugging with GDB.
Makes sense -- nfa_match() has only the state list in scope, no row
index, which is exactly the thing that is hard to place in GDB. I will
thread currentPos through.
> The comments above in nfa_match are unnecessary because in ``if
> (RPRElemIsAbsorbableBranch(elem)`` we already have many comments, and
> the meaning seems the same.
Agreed -- the loop-top overview's second sentence restates the inline
END-chain block. I will drop it and keep "Evaluate VAR elements against
current row" as a one-line lead.
> Similar to REG_DEBUG, we really need an RPR_DEBUG option. [...] verify
> that ctx->states is not a circular linked list [...] without affecting
> release builds.
Fair point, and there is some history: RPR_DEBUG did exist earlier as
debug logging, and I am the one who proposed removing it. My reason was
narrow, though. I lean on the debugger, and with AI assistance the cost of
inserting, watching, and then removing ad-hoc logs has dropped sharply --
so I treated those logs as ephemeral and optimized for a clean committed
tree rather than for someone reading execRPR.c cold.
For a new contributor that trade-off is the wrong one, and you have
convinced me: a retained set of observation points, plus the Assert-style
sanity checks you describe (non-circular state list, count and
context-ordering invariants), would genuinely lower the barrier to this
code. So I agree with the direction. Rather than settle on a shape now, I
would leave it open: which observation points and checks are actually
worth retaining behind #ifdef RPR_DEBUG can be picked out later as a
separate follow-up, and it is really Tatsuo's call since he suggested the
first version.
> <para> The SQL standard defines more subclauses: MEASURES and SUBSET
> [...] more variations in AFTER MATCH clause. </para>
> This part should stay in the Compatibility section.
Agreed. It is a standard-deviation note, and the SELECT Compatibility
section currently has no row-pattern text, so I will move this paragraph
there.
Unless Tatsuo has a different view by the weekend, I will proceed with the
concrete changes above, leaving the RPR_DEBUG shape open as noted.
Best regards,
Henson
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Henson Choi | 2026-07-02 00:23:30 | Re: Row pattern recognition |
| Previous Message | dogeon yoo | 2026-07-02 00:10:17 | Re: Tighten pg_uhc_verifychar() to enforce CP949 lead/trail byte ranges |