Re: Bug in row_number() optimization

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Sergey Shinderuk <s(dot)shinderuk(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, David Rowley <drowley(at)postgresql(dot)org>
Subject: Re: Bug in row_number() optimization
Date: 2022-11-25 12:46:11
Message-ID: CAMbWs49DaFZViFVD-SwdWxk3JrWvyRq9j+_oopiAZKxM0eEZCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 25, 2022 at 11:26 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> There are two different pass-through modes that the WindowAgg can move
> into when it detects that the run condition is no longer true:
>
> 1) WINDOWAGG_PASSTHROUGH and
> 2) WINDOWAGG_PASSTHROUGH_STRICT
>
> #2 is used when the WindowAgg is the top-level one in this query
> level. Remember we'll need multiple WindowAgg nodes when there are
> multiple different windows to evaluate. The reason that we need #1 is
> that if there are multiple WindowAggs, then the top-level one (or just
> any WindowAgg above it) might need all the rows from a lower-level
> WindowAgg.

Thanks for the explanation! I think now I understand pass-through modes
much better.

> What I mean by "WindowAggs above us cannot reference the result of
> another WindowAgg" is that the evaluation of sum(qty) over (order by
> date) can't see the "rn" column. SQL does not allow it.

I think I get your point. Yeah, the 'rn' column is not needed for the
evaluation of WindowAggs above. But ISTM it might be needed to evaluate
the quals of WindowAggs above. Such as in the plan below

explain (costs off) SELECT * FROM
(SELECT
count(salary) OVER (PARTITION BY depname || '') c1, -- w1
row_number() OVER (PARTITION BY depname) rn -- w2
FROM empsalary
) e WHERE rn <= 1;
QUERY PLAN
-------------------------------------------------------------------
Subquery Scan on e
-> WindowAgg
Filter: ((row_number() OVER (?)) <= 1)
-> Sort
Sort Key: (((empsalary.depname)::text || ''::text))
-> WindowAgg
Run Condition: (row_number() OVER (?) <= 1)
-> Sort
Sort Key: empsalary.depname
-> Seq Scan on empsalary
(10 rows)

The 'rn' column is calculated in the lower-level WindowAgg, and it is
needed to evaluate the 'Filter' of the upper-level WindowAgg. In
pass-through mode, the lower-level WindowAgg would not be evaluated any
more, so we need to mark the 'rn' column to something that can false the
'Filter'. Considering the 'Filter' is a strict function, marking it as
NULL would do. I think this is why this patch works.

> Just thinking of the patch a bit more, what I wrote ends up
> continually zeroing the values and marking the columns as NULL. Likely
> we can just do this once when we do: winstate->status =
> WINDOWAGG_PASSTHROUGH;

Yes, I also think we can do this only once when we go into pass-through
mode.

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2022-11-25 13:39:59 Re: Bug in MERGE's test for tables with rules
Previous Message Peter Eisentraut 2022-11-25 12:37:46 Re: xml2: add test for coverage