| From: | Henson Choi <assam258(at)gmail(dot)com> |
|---|---|
| To: | jian he <jian(dot)universality(at)gmail(dot)com> |
| Cc: | Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org, zsolt(dot)parragi(at)percona(dot)com, sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, li(dot)evan(dot)chao(at)gmail(dot)com |
| Subject: | Re: Row pattern recognition |
| Date: | 2026-06-24 06:04:09 |
| Message-ID: | CAAAe_zC71Lv4UFyMBhoK0gAAUfJ97-7fbj0cVShQDYryWCYqqg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi jian,
Thanks for the refactor. I went through it change by change -- the
per-file notes are at the bottom.
A note up front, so the notes don't read as dismissive: at this stage I'd
rather prioritize validated code over better code. The branch has a lot of
coverage and Valgrind behind it, and every change -- even a clean,
equivalent one -- moves a line out from under that. So where I say "drop",
it means "not now", not "not good"; most of these would be welcome as a
follow-up. One piece, though, does need a fix before any of it lands.
That piece is the new guard in computeAbsorbability(): it carries a
behavior change that no test covers.
> Move isAbsorbable initialization from finalizeRPRPattern() into
> makeRPRPattern(), and guard the computeAbsorbabilityRecursive() call in
> computeAbsorbability() so it is only entered when the first element is
> an ALT or has an unbounded quantifier.
First, what "absorbable" means here: the first unbounded repetition of a
fixed-length body reachable from the start of the pattern. This rule is
worth writing down -- in README.rpr and in the comment on
computeAbsorbability() (or wherever fits best) -- so it is stated once,
explicitly (patch attached: nocfbot-absorb-definition-doc).
By that definition the leading A+ in (A+ B){2,3} is absorbable, but the
patch drops it.
The offending check is in computeAbsorbability():
if (RPRElemIsAlt(&pattern->elements[0]) ||
pattern->elements[0].max == RPR_QUANTITY_INF)
computeAbsorbabilityRecursive(pattern, 0, &hasAbsorbable);
It looks only at elements[0], so a bounded outer group's BEGIN (finite max)
fails the check and the recursion that would descend to A+ never runs.
I'm attaching a small test for this (nocfbot-absorb-bounded-outer-test): it
passes on the branch and fails with v50-0001 applied, pinning both the a+"
marker and the absorbed-context count.
For the rest, change by change -- "ok" = take as-is, "drop" = leave out:
src/backend/optimizer/plan/rpr.c
The branch is already validated under coverage and Valgrind, so I'd rather
not re-open verified logic for equivalent refactors -- glad to take those
as a follow-up. In file order:
ok palloc_array / palloc0_array in makeRPRPattern()
result->varNames = palloc_array(char *, numVars);
result->elements = palloc0_array(RPRPatternElement, numElements);
ok move the isAbsorbable = false init into makeRPRPattern()
drop rename beginElem -> elem in fillRPRPatternGroup()
only beginElem was renamed; with endElem kept, the begin/end pair is
no longer symmetric (elem / endElem)
drop rename isFixedLengthChildren -> isFixedQuantifier -- keep the old
name
"fixed-length" is the term everywhere else (README.rpr, the nearby
comments, even this function's own body), and it drops the
"children"
that flags a whole-subtree check
drop the jump-1 traversal rewrite in that function -- equivalent
refactor,
defer to a follow-up
drop rename to start_with_unbounded_quantifier -- keep isUnboundedStart
Lone snake_case among camelCase neighbors -- including
isFixedQuantifier,
renamed in this same patch -- and it reads as a predicate though it
sets flags.
drop start_with_*'s new signature + if/else restructure -- equivalent
refactor; keep the original isUnboundedStart body
drop rename branchFirst -> branchElement -- cosmetic churn, no behavior
change
drop the new computeAbsorbability() guard -- the regression above
if (RPRElemIsAlt(&pattern->elements[0]) ||
pattern->elements[0].max == RPR_QUANTITY_INF)
src/include/nodes/plannodes.h
ok drop the redundant /* T_RPRPattern */ on the type field
NodeTag type;
drop reworded "see isUnboundedStart" block comment -- keep the original
(it still matches now that isUnboundedStart stays)
ok drop the inline isAbsorbable comment -- the block comment above
covers it
bool isAbsorbable;
src/backend/executor/README.rpr
ok spell out the skip mode -- a clear doc fix on its own
(1) SKIP PAST LAST ROW (not SKIP TO NEXT ROW)
drop isUnboundedStart -> start_with_unbounded_quantifier -- follows the
rename
Tatsuo, what's your read on all this -- in particular, hold the equivalent
refactors until after commit, or take them now?
Best regards,
Henson
| Attachment | Content-Type | Size |
|---|---|---|
| nocfbot-absorb-bounded-outer-test.txt | text/plain | 3.8 KB |
| nocfbot-absorb-definition-doc.txt | text/plain | 1.5 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Peter Smith | 2026-06-24 06:04:27 | Re: Support EXCEPT for ALL SEQUENCES publications |
| Previous Message | Bharath Rupireddy | 2026-06-24 05:58:47 | Re: Add MIN/MAX aggregate support for uuid |