Re: Row pattern recognition

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: assam258(at)gmail(dot)com
Cc: jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, peter(at)eisentraut(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Row pattern recognition
Date: 2026-01-15 12:04:57
Message-ID: 20260115.210457.19846684278230391.ishii@postgresql.org
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Henson,

>> So my question is, how do you ensure that ctxFrameEnd does not go
>> beyond the full window frame end?
>>
>
> In update_reduced_frame():
>
> frameOffset = endOffsetValue; // e.g., 2 from "2 FOLLOWING"
>
> for each row (currentPos):
>
> if (!rowExists) // partition end reached
> finalize all contexts;
> break;
>
> for each context:
> ctxFrameEnd = matchStartRow + frameOffset + 1;
> if (currentPos >= ctxFrameEnd)
> finalize this context;
> continue;
>
> Even if ctxFrameEnd exceeds partition end, the "if (!rowExists)" check
> fires first and finalizes all contexts at the actual partition boundary.

We need to check the *frame" end, not the partition end.

I think your patch relies on !window_gettupleslot() to check whether
the row exists.

if (!window_gettupleslot(winobj, pos, slot))
return false; /* No row exists */

But the function only checks the row existence in the current partition:

* Fetch the pos'th tuple of the current partition into the slot,

Thus it is possible that window_gettupleslot() returns true but the
row is not in the current frame in case that the partition is divided
into some frames. You need to check the row existence in a frame. For
this purpose you can use row_is_in_frame().

I ran following query and got the result with v38 (which includes your
NFA patches).

WITH data AS (
SELECT * FROM (VALUES
('A', 1), ('A', 2),
('B', 3), ('B', 4)
) AS t(gid, id))
SELECT gid, id, array_agg(id) OVER w
FROM data
WINDOW w AS (
PARTITION BY gid
ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING
AFTER MATCH SKIP TO NEXT ROW
PATTERN (A+)
DEFINE A AS id < 10
);
gid | id | array_agg
-----+----+-----------
A | 1 | {1,2}
A | 2 |
B | 3 | {3,4}
B | 4 |
(4 rows)

I think the second and 4th rows are expected to return some data in
array_agg colum. In fact v37 patch returns following results for the
same query:

gid | id | array_agg
-----+----+-----------
A | 1 | {1,2}
A | 2 | {2}
B | 3 | {3,4}
B | 4 | {4}
(4 rows)

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2026-01-15 12:18:13 Re: Flush some statistics within running transactions
Previous Message Shinya Kato 2026-01-15 11:44:24 Re: file_fdw: Support multi-line HEADER option.