Quick Links

Re: Row pattern recognition

From:	Henson Choi <assam258(at)gmail(dot)com>
To:	Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc:	zsolt(dot)parragi(at)percona(dot)com, sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, li(dot)evan(dot)chao(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Row pattern recognition
Date:	2026-05-05 04:26:56
Message-ID:	CAAAe_zCwY4Lt_OS44DMuqua-szeDVVTgJDeH=c-d5Qco5v2sOQ@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi Tatsuo,

Thanks for the close look. Inline below, then a sketch of what I
plan to put in v48 so you can pick what to pull first.

v47 starts to check whether range variable qualified expressions are
> used in DEFINE clause. If used, raise an error. This is good because
> we don't support the syntax yet (pattern variable range var), or it's
> prohibited (from clause range var). However, the error message may not
> be appropreate for the case when complex data type is involved.
>
> CREATE TEMP TABLE item (name TEXT, amount INT);
> CREATE TABLE
> CREATE TEMP TABLE sales(items item);
> CREATE TABLE
> SELECT (items).name, (items).amount, count(*) OVER w
> FROM sales WINDOW w AS (
> ORDER BY (items).name
> ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
> AFTER MATCH SKIP PAST LAST ROW
> PATTERN (A+)
> DEFINE A AS (A.items).amount > 10
> );
> ERROR: pattern variable qualified column reference "a.items" is not
> supported in DEFINE clause
> LINE 7: DEFINE A AS (A.items).amount > 10
> ^
> If I change the DEFINE clause to:
>
> DEFINE A AS (sales.items).amount > 10
>
> I get:
>
> ERROR: range variable qualified column reference "sales.items" is not
> allowed in DEFINE clause
> LINE 8: DEFINE A AS (sales.items).amount > 10
> ^
>
> In both cases, "a.items" or "sales.items" in the error messages are
> not column names, therefore the wording "column reference" in the
> error messages are not appropriate.
>

Agreed. Even for the simple non-composite cases the noun is fuzzy
once A_Indirection enters the picture. "expression" reads naturally
in both cases.

> In order to fix the issue, I think we need to add code to understand
> range var qualification form "A.item" or "sales.items" in complex data
> types case. But is it worth the trouble? The reword is just nicer
> error messages. If we support MEASURES, the code is useful. But so far
> we have decided to not support it in the first cut of RPR.
>

Not now. Teaching the pre-check about the surrounding A_Indirection
duplicates work that MEASURES will need anyway, and the only visible
gain before MEASURES is a more accurate echo of the source text.

Instead I'll mark the limitation in three places so future MEASURES
work can't miss it: a block comment above the pre-check that names
the limitation and the revisit point, an XXX cross-reference in
parse_rpr.c pointing back to the pre-check, and composite-type cases
in rpr_base whose expected output quotes only the ColumnRef portion
-- so when indirection-aware quoting lands, those outputs churn as
a tripwire.

> Maybe we should just change the error messages something like below
> for now?
>
> ERROR: pattern variable qualified expression "a.items" is not supported
> in DEFINE clause
> ERROR: range variable qualified expression "sales.items" is not allowed
> in DEFINE clause

Yes -- planned for v48. While I'm in that path I'll also clean up
two adjacent issues:

- the pre-check is binary, so an unknown qualifier (typo,
undefined name) is misreported as a range-variable reference
with SYNTAX_ERROR instead of falling through to the standard
"column does not exist" / "missing FROM-clause entry"
diagnostic;

- parse_rpr.c and the SELECT docs claim DEFINE-name "collection"
and "filtering during planning"; the actual behavior is
validate-and-reject at parse analysis.

------------------------------------------------------------------
v48 follow-up plan
------------------------------------------------------------------

The items below are independent, so they can ship as separate
patches or as a single batched posting -- whichever you prefer.
Order is just for narrative; nothing depends on anything else.

[A] Sizable refactor: collapse the four DEFINE walkers across
parser/planner/executor into a single phase-tagged
traversal. On that base, reject volatile / NextValueExpr
in DEFINE (the NFA may re-evaluate predicates during
backtracking; STABLE / IMMUTABLE remain accepted), and
bundle a STABLE/IMMUTABLE baseline test as a guard against
accidental over-rejection.

[B] Make the empty-match path observable in tests: replace the
stale XXX comments in rpr_nfa with the actual behavior, and
add EXPLAIN ANALYZE coverage in rpr_explain that surfaces
"NFA: N matched (len 0/0/0.0)" so the NFA-found-but-window-
empty case is regression-visible.

[C] Trim the per-advance NFA visited-bitmap reset to a high-water
mark range instead of the full bitmap. Tradeoff: two int16
comparisons added per visit, paying off for larger NFAs but
added overhead for single-word bitmaps; semantics unchanged.
I'll leave the decision to apply to your judgment.

[D] DEFINE qualifier diagnostic: tri-classify (pattern var /
range var / fall-through), reword to "expression", add
unknown-qualifier and composite-type tests, and sync the
adjacent stale comments and SELECT doc.

If any of [A]..[D] looks misjudged or you'd prefer a different
slicing, I'll reshape before posting.

Best,
Henson

In response to

Re: Row pattern recognition at 2026-05-05 00:01:24 from Tatsuo Ishii

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Amit Kapila	2026-05-05 04:57:47	Re: Include schema-qualified names in publication error messages.
Previous Message	shveta malik	2026-05-05 04:06:55	Re: Proposal: Conflict log history table for Logical Replication