| From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
|---|---|
| To: | assam258(at)gmail(dot)com |
| Cc: | sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Row pattern recognition |
| Date: | 2026-03-07 03:01:51 |
| Message-ID: | 20260307.120151.1477244845022229828.ishii@postgresql.org |
| Lists: | pgsql-hackers |
Hi Henson,
> Hi, Tatsuo
>
>> Does "a zero-length match" mean "an empty match"?
>
> Yes, they refer to the same thing. "Zero-length match" is the more
> common term in general regex implementations (PCRE2, Perl, Python,
> Java, etc.[1]), but the RPR standard (ISO/IEC 19075-5, Section 4.12.2)
> uses "empty match" exclusively.
>
> [1] https://www.regular-expressions.info/zerolength.html
I found that Trino uses "empty match" too [2]. So for SQL users, I
guess "empty match" is the more familiar wording.
> Yes, we should follow master's convention. I see three options:
>
> (a) Reorder within nodeWindowAgg.c: move the nfa_* functions up and
> keep the "API exposed to window functions" section at the bottom,
> matching master's layout.
>
> (b) Separate file under src/backend/executor/, keeping it close to
> nodeWindowAgg.c while making the boundary explicit.
>
> (c) A dedicated src/backend/rpr/ directory modeled on
> src/backend/regex/, giving the NFA engine its own namespace.
> This could also be an opportunity to consolidate the existing
> src/backend/optimizer/plan/rpr.c into the same directory.
>
> For now (a) is the safest change. Longer term, (b) or (c) would make
> more sense -- especially when we extend to MATCH_RECOGNIZE (R010),
> where the NFA engine will need to be shared across both code paths.
> Either way, the NFA engine can be exposed via a header so that R010
> can share it without further restructuring.
>
> Since the NFA algorithm is not familiar territory for most DBMS
> developers, it would also be worth preserving the detailed algorithm
> description posted earlier in this thread -- either as structured
> comments or as a dedicated README alongside the code.
>
> What do you think? Should we start with (a) now and revisit the
> broader restructuring approaches -- (b) or (c) -- later, or would you
> prefer to discuss them first? Either of those would also resolve the
> file layout convention issue naturally, since new files would follow
> proper conventions from the start.
I prefer (a) or (b) for now, at least for the first commit. The reason
is that the current nfa functions take a WindowAggState argument. If we
choose (c), I think we need to change some (or most) of the nfa
functions so that they no longer take the WindowAggState argument.
What do you think?
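To make the concern concrete, here is a minimal sketch of what removing
that dependency could look like. Every name in it (RprNfaContext,
rpr_eval_fn, rpr_nfa_match, DemoCallerState, demo_eval) is hypothetical
and does not come from the patch. The matching loop is deliberately
trivial (plain concatenation, no quantifiers or alternation) and is not
the NFA algorithm; it only shows the interface shape, where the engine
sees an opaque caller state and a callback instead of WindowAggState.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: does the row at "pos" satisfy variable "var"? */
typedef bool (*rpr_eval_fn) (void *caller_state, int64_t pos, int var);

/* Hypothetical engine context; nothing here refers to WindowAggState. */
typedef struct RprNfaContext
{
	const int  *pattern;		/* pattern variable ids, concatenation only */
	size_t		npattern;		/* number of entries in pattern */
	rpr_eval_fn eval;			/* supplied by the caller */
	void	   *caller_state;	/* opaque to the engine */
} RprNfaContext;

/*
 * Try to match the pattern starting at row "start" of "nrows" rows.
 * Returns the number of rows matched (0 means an empty match), or -1
 * if there is no match.
 */
int64_t
rpr_nfa_match(const RprNfaContext *ctx, int64_t start, int64_t nrows)
{
	assert(ctx != NULL && ctx->eval != NULL);

	if (start < 0 || (int64_t) ctx->npattern > nrows - start)
		return -1;

	for (size_t i = 0; i < ctx->npattern; i++)
	{
		if (!ctx->eval(ctx->caller_state, start + (int64_t) i,
					   ctx->pattern[i]))
			return -1;
	}
	return (int64_t) ctx->npattern;
}

/* Stand-in for a caller such as the window executor (assumption). */
typedef struct DemoCallerState
{
	const int  *row_class;		/* variable id each row satisfies */
} DemoCallerState;

bool
demo_eval(void *caller_state, int64_t pos, int var)
{
	const DemoCallerState *st = (const DemoCallerState *) caller_state;

	return st->row_class[pos] == var;
}
```

With this shape the window executor would pass its own state through
caller_state, and a future R010 code path could pass something else,
without the engine knowing about either. Whether the real nfa functions
can be split this cleanly is exactly the open question.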
> One more thing: there are no ECPG example programs or regression tests
> for RPR yet. I'd like to propose adding them. Shall I draft an
> initial set, or would you prefer to coordinate with the ECPG
> maintainers first?
I am not familiar with ECPG. Do you know if ECPG has any Window clause
tests? If it does not, is it worth adding RPR tests to ECPG?
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp