Re: Row pattern recognition

From: Henson Choi <assam258(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>, Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: zsolt(dot)parragi(at)percona(dot)com, sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, li(dot)evan(dot)chao(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Row pattern recognition
Date: 2026-05-29 09:41:52
Message-ID: CAAAe_zCbSU=dd-4qTL2QaBQwQ-cf51N_851a9Y5rOoz0wj0aXw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Jian, Tatsuo,

Jian, your review keeps paying off beyond the items themselves. Working
through the ExecRPRProcessRow refactor you proposed -- dropping
hasLimitedFrame and fixing the "endOffsetValue != 0" Datum-vs-0
comparison -- and writing the offset-0 regression test it calls for sent
me back to the standard, where I hit a conformance reading I'd value both
of your judgments on.

The case is the single-row full window frame. It has two spellings,
semantically identical -- both make the full window frame exactly the
current row:

ROWS BETWEEN CURRENT ROW AND CURRENT ROW
ROWS BETWEEN CURRENT ROW AND 0 FOLLOWING

PostgreSQL accepts both today (rpr_base tests the 0 FOLLOWING form with
PATTERN (A)). But I'm no longer sure either is sanctioned by Subclause
6.10.2, "ROWS BETWEEN CURRENT ROW AND", which reads (paraphrasing
ISO/IEC 19075-5):

"When performing row pattern recognition in a window, only two options
are allowed for specifying the window frame extent:
- ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
- ROWS BETWEEN CURRENT ROW AND offset FOLLOWING ... [the offset]
shall be a positive integer ..."

Two things put the single-row frame outside a literal reading, and they
hit both spellings equally:

- CURRENT ROW as the *end* bound is not one of the two listed options;
and

- the offset form requires a *positive integer*, and 0 is not positive
-- so AND 0 FOLLOWING is no better sanctioned than AND CURRENT ROW.

The two spellings therefore stand or fall together: they denote the same
one-row frame, and neither is literally among 6.10.2's options.
(README.rpr X-4 already follows the strict line -- it lists only
"UNBOUNDED FOLLOWING or n FOLLOWING".)

Either way the behavior is unambiguous: a single-row search space, so the
pattern can only ever map the current row. That makes it a degenerate,
WHERE-like use rather than real row-pattern matching -- which one might
read as benign (so allow) or as pointless (so reject), so it doesn't
settle the question for me.

This is, in effect, a borderline, harmless standard violation: it sits
just outside 6.10.2 as written, yet is well-defined, runs cleanly, and
leaves every conforming frame unchanged.

So, concretely, the choice:

(a) Allow it -- accept both spellings and document the degenerate
frame as a deliberate extension; or

(b) Forbid it -- follow 6.10.2 to the letter ("only two options" and a
"positive integer" offset) and reject both spellings in
transformRPR().

Which reading do you take? Tatsuo, as co-author your call carries it;
Jian, I'd value yours too, since you were just in this code.

This isn't blocking. The implementation accepts offset 0 today, so for
now my regression test exercises it (via the 0 FOLLOWING spelling, as
rpr_base already does); if the answer is (b), that test and the existing
one become error tests together.

Thanks,
Henson

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nazir Bilal Yavuz 2026-05-29 09:51:29 Re: Heads Up: cirrus-ci is shutting down June 1st
Previous Message Dilip Kumar 2026-05-29 09:22:58 Re: Proposal: Conflict log history table for Logical Replication