Re: Row pattern recognition

From: Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>
To: vik(at)postgresfriends(dot)org
Cc: jchampion(at)timescale(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Row pattern recognition
Date: 2023-07-28 11:02:30
Message-ID: 20230728.200230.1800661389729335298.t-ishii@sranhm.sra.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>> Attached is the v3 patch. In this patch following changes are made.
>
> Excellent. Thanks!

You are welcome!

> A few quick comments:
>
> - PERMUTE is still misspelled as PREMUTE

Oops. Will fix.

> - PATTERN variables do not have to exist in the DEFINE clause. They are
> - considered TRUE if not present.

Do you think we really need this? I found a criticism regarding this.

https://link.springer.com/article/10.1007/s13222-022-00404-3
"3.2 Explicit Definition of All Row Pattern Variables"

What do you think?

>> - I am working on making window aggregates RPR aware now. The
>> implementation is in progress and far from completeness. An example
>> is below. I think row 2, 3, 4 of "count" column should be NULL
>> instead of 3, 2, 0, 0. Same thing can be said to other
>> rows. Probably this is an effect of moving aggregate but I still
>> studying the window aggregation code.
>
> This tells me again that RPR is not being run in the right place. All
> windowed aggregates and frame-level window functions should Just Work
> with no modification.

I am not touching each aggregate function. I am modifying
eval_windowaggregates() in nodeWindowAgg.c, which calls each aggregate
function. Do you think it's not the right place to make window
aggregates RPR aware?

>> SELECT company, tdate, first_value(price) OVER W, count(*) OVER w FROM
>> stock
>> WINDOW w AS (
>> PARTITION BY company
>> ORDER BY tdate
>> ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
>> AFTER MATCH SKIP PAST LAST ROW
>> INITIAL
>> PATTERN (START UP+ DOWN+)
>> DEFINE
>> START AS TRUE,
>> UP AS price > PREV(price),
>> DOWN AS price < PREV(price)
>> );
>> company | tdate | first_value | count
>> ----------+------------+-------------+-------
>> company1 | 2023-07-01 | 100 | 4
>> company1 | 2023-07-02 | | 3
>> company1 | 2023-07-03 | | 2
>> company1 | 2023-07-04 | | 0
>> company1 | 2023-07-05 | | 0
>> company1 | 2023-07-06 | 90 | 4
>> company1 | 2023-07-07 | | 3
>> company1 | 2023-07-08 | | 2
>> company1 | 2023-07-09 | | 0
>> company1 | 2023-07-10 | | 0
>> company2 | 2023-07-01 | 50 | 4
>> company2 | 2023-07-02 | | 3
>> company2 | 2023-07-03 | | 2
>> company2 | 2023-07-04 | | 0
>> company2 | 2023-07-05 | | 0
>> company2 | 2023-07-06 | 60 | 4
>> company2 | 2023-07-07 | | 3
>> company2 | 2023-07-08 | | 2
>> company2 | 2023-07-09 | | 0
>> company2 | 2023-07-10 | | 0
>
> In this scenario, row 1's frame is the first 5 rows and specified SKIP
> PAST LAST ROW, so rows 2-5 don't have *any* frame (because they are
> skipped) and the result of the outer count should be 0 for all of them
> because there are no rows in the frame.

Ok. Just I want to make sure. If it's other aggregates like sum or
avg, the result of the outer aggregates should be NULL.

> When we get to adding count in the MEASURES clause, there will be a
> difference between no match and empty match, but that does not apply
> here.

Can you elaborate more? I understand that "no match" and "empty match"
are different things. But I do not understand how the difference
affects the result of count.

>> I am going to add this thread to CommitFest and plan to add both of
>> you as reviewers. Thanks in advance.
>
> My pleasure. Thank you for working on this difficult feature.

Thank you for accepting being registered as a reviewer. Your comments
are really helpful.
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Matthias van de Meent 2023-07-28 11:11:47 Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Previous Message Alvaro Herrera 2023-07-28 10:47:44 Re: cataloguing NOT NULL constraints