| From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
|---|---|
| To: | assam258(at)gmail(dot)com |
| Cc: | jian(dot)universality(at)gmail(dot)com, zsolt(dot)parragi(at)percona(dot)com, sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, li(dot)evan(dot)chao(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Row pattern recognition |
| Date: | 2026-06-17 13:13:27 |
| Message-ID: | 20260617.221327.809417533229490738.ishii@postgresql.org |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Henson,
> Hi Tatsuo, Jian,
>
> I think there's a correctness problem in the RPR patch: a window function's
> result can change depending on which other, unrelated window functions are
> in the same query.
>
> Pattern matching only advances when a window function reads the frame.
> nth_value(x, n) returns NULL without reading the frame when n is NULL
> (correct per the standard), so if it is the only window function in an RPR
> window, the match never advances over those rows and the reduced frame no
> longer matches a full scan.
Ouch.
> Example -- one partition, 60 rows, price = id * 10:
>
> CREATE TABLE rpr_dormant (id int, price int);
> INSERT INTO rpr_dormant SELECT g, g*10 FROM generate_series(1,60) g;
>
> SELECT id, nth_value(price, CASE WHEN id < 50 THEN NULL ELSE 1 END) OVER w
> FROM rpr_dormant
> WINDOW w AS (
> ORDER BY id
> ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
> AFTER MATCH SKIP PAST LAST ROW
> PATTERN (A+)
> DEFINE A AS price > PREV(FIRST(price), 50)
> );
>
> Run alone, the nth_value column does not follow the actual match structure;
> adding an unrelated first_value(id) OVER w, which reads the frame every row,
> changes it. And while the match is dormant the mark position keeps
> advancing, running ahead and trimming rows that the backward navigation
> later needs -- so the same query can instead fail with "cannot fetch row N
> before WindowObject's mark position".
>
> I think an RPR window should perform the match for each row up front,
> building its reduced frame during the row scan before the window functions
> are evaluated, regardless of whether any function reads the frame. The fix
> belongs in the executor, not in nth_value -- the early return is standard,
> and the same gap is reachable from any user-defined function that skips the
> frame.
>
> Does this direction seem right, or is the lazy, frame-driven matching
> intentional in a way I'm missing? Happy to prepare a patch.
Yes, I think the direction is correct. Probably the patch would someting like this?
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index cb6a484b7de..b6c12096c85 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2523,6 +2523,9 @@ ExecWindowAgg(PlanState *pstate)
{
if (winstate->rpSkipTo == ST_NEXT_ROW)
clear_reduced_frame(winstate);
+
+ update_reduced_frame(winstate->nav_winobj,
+ winstate->frameheadpos);
}
/*
> is the lazy, frame-driven matching
> intentional in a way I'm missing?
Not intentional.
Regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Ilia Evdokimov | 2026-06-17 13:18:39 | Re: Optional skipping of unchanged relations during ANALYZE? |
| Previous Message | Daniel Gustafsson | 2026-06-17 13:04:59 | Re: DataChecksumsStateStruct cost_delay fields and locking |