| From: | Henson Choi <assam258(at)gmail(dot)com> |
|---|---|
| To: | jian he <jian(dot)universality(at)gmail(dot)com>, Tatsuo Ishii <ishii(at)postgresql(dot)org> |
| Cc: | zsolt(dot)parragi(at)percona(dot)com, sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, li(dot)evan(dot)chao(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Row pattern recognition |
| Date: | 2026-06-16 05:32:58 |
| Message-ID: | CAAAe_zCQaV_im5WNbU1D1HAQDUk3yYK3nSduZZ8Qu_aK-aRA9A@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Tatsuo, Jian,
I think there's a correctness problem in the RPR patch: a window function's
result can change depending on which other, unrelated window functions are
in the same query.
Pattern matching only advances when a window function reads the frame.
nth_value(x, n) returns NULL without reading the frame when n is NULL
(correct per the standard), so if it is the only window function in an RPR
window, the match never advances over those rows and the reduced frame no
longer matches a full scan.
Example -- one partition, 60 rows, price = id * 10:
CREATE TABLE rpr_dormant (id int, price int);
INSERT INTO rpr_dormant SELECT g, g*10 FROM generate_series(1,60) g;
SELECT id, nth_value(price, CASE WHEN id < 50 THEN NULL ELSE 1 END) OVER w
FROM rpr_dormant
WINDOW w AS (
ORDER BY id
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
AFTER MATCH SKIP PAST LAST ROW
PATTERN (A+)
DEFINE A AS price > PREV(FIRST(price), 50)
);
Run alone, the nth_value column does not follow the actual match structure;
adding an unrelated first_value(id) OVER w, which reads the frame every row,
changes it. And while the match is dormant the mark position keeps
advancing, running ahead and trimming rows that the backward navigation
later needs -- so the same query can instead fail with "cannot fetch row N
before WindowObject's mark position".
I think an RPR window should perform the match for each row up front,
building its reduced frame during the row scan before the window functions
are evaluated, regardless of whether any function reads the frame. The fix
belongs in the executor, not in nth_value -- the early return is standard,
and the same gap is reachable from any user-defined function that skips the
frame.
Does this direction seem right, or is the lazy, frame-driven matching
intentional in a way I'm missing? Happy to prepare a patch.
Best regards,
Henson
| From | Date | Subject | |
|---|---|---|---|
| Next Message | jian he | 2026-06-16 05:48:29 | IGNORE/RESPECT NULLS can be specified for (prokind == 'f'). |
| Previous Message | Xuneng Zhou | 2026-06-16 05:29:35 | Re: Fix race in ReplicationSlotRelease for ephemeral slots |