Re: Window Function "Run Conditions"

From: Zhihong Yu <zyu(at)yugabyte(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Window Function "Run Conditions"
Date: 2022-03-22 23:54:08
Message-ID: CALNJ-vTG+SDaCs_u0G9TDUngT7zVGcPxaYAJf+cjbbh8G_3GXA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 22, 2022 at 3:24 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> On Thu, 26 Aug 2021 at 14:54, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
> >
> > On Thu, Aug 19, 2021 at 2:35 PM David Rowley <dgrowleyml(at)gmail(dot)com>
> wrote:
> > >
> > > On Thu, 19 Aug 2021 at 00:20, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
> wrote:
> > > > In the current master, the result is:
> > > >
> > > > empno | salary | c | dr
> > > > -------+--------+---+----
> > > > 8 | 6000 | 4 | 1
> > >
> > > > In the patched version, the result is:
> > > >
> > > > empno | salary | c | dr
> > > > -------+--------+---+----
> > > > 8 | 6000 | 1 | 1
> > >
> > > Thanks for taking it for a spin.
> > >
> > > That's a bit unfortunate. I don't immediately see how to fix it other
> > > than to restrict the optimisation to only apply when there's a single
> > > WindowClause. It might be possible to relax it further and only apply
> > > if it's the final window clause to be evaluated, but in those cases,
> > > the savings are likely to be much less anyway as some previous
> > > WindowAgg will have exhausted all rows from its subplan.
> >
> > I am trying to hack the select_active_windows function to make
> > sure the WindowClause with Run Condition clause to be the last one
> > to evaluate (we also need to consider more than 1 window func has
> > run condition), at that time, the run condition clause is ready already.
> >
> > However there are two troubles in this direction: a). This may conflict
> > with "the windows that need the same sorting are adjacent in the list."
> > b). "when two or more windows are order-equivalent then all peer rows
> > must be presented in the same order in all of them. .. (See General Rule
> 4 of
> > <window clause> in SQL2008 - SQL2016.)"
> >
> > In summary, I am not sure if it is correct to change the execution Order
> > of WindowAgg freely.
>
> Thanks for looking at that.
>
> My current thoughts are that it just feels a little too risky to
> adjust the comparison function that sorts the window clauses to pay
> attention to the run-condition.
>
> We would need to ensure that there's just a single window function
> with a run condition as it wouldn't be valid for there to be multiple.
> It would be easy enough to ensure we only push quals into just 1
> window clause, but that and meddling with the evaluation order has
> trade-offs. To do that properly, we'd likely want to consider the
> costs when deciding which window clause would benefit from having
> quals pushed the most. Plus, if we start messing with the evaluation
> order then we'd likely really want some sort of costing to check if
> pushing a qual and adjusting the evaluation order is actually cheaper
> than not pushing the qual and keeping the clauses in the order that
> requires the minimum number of sorts. The planner is not really
> geared up for costing things like that properly, that's why we just
> assume the order with the least sorts is best. In reality that's often
> not going to be true as an index may exist and we might want to
> evaluate a clause first if we could get rid of a sort and index scan
> instead. Sorting the window clauses based on their SortGroupClause is
> just the best we can do for now at that stage in planning.
>
> I think it's safer to just disable the optimisation when there are
> multiple window clauses. Multiple matching clauses are merged
> already, so it's perfectly valid to have multiple window functions,
> it's just they must share the same window clause. I don't think
> that's terrible as with the major use case that I have in mind for
> this, the window function is only added to limit the number of rows.
> In most cases I can imagine, there'd be no reason to have an
> additional window function with different frame options.
>
> I've attached an updated patch.
>
Hi,
The following code seems to be common between if / else blocks (w.r.t.
wfunc_left) of find_window_run_conditions().

+ op = get_opfamily_member(opinfo->opfamily_id,
+ opinfo->oplefttype,
+ opinfo->oprighttype,
+ newstrategy);
+
+ newopexpr = (OpExpr *) make_opclause(op,
+ opexpr->opresulttype,
+ opexpr->opretset,
+ otherexpr,
+ (Expr *) wfunc,
+ opexpr->opcollid,
+ opexpr->inputcollid);
+ newopexpr->opfuncid = get_opcode(op);
+
+ *keep_original = true;
+ runopexpr = newopexpr;

It would be nice if this code can be shared.

+ WindowClause *wclause = (WindowClause *)
+ list_nth(subquery->windowClause,
+ wfunc->winref - 1);

The code would be more readable if list_nth() is indented.

+ /* Check the left side of the OpExpr */

It seems the code for checking left / right is the same. It would be better
to extract and reuse the code.

Cheers

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2022-03-22 23:56:56 Re: SQL/JSON: functions
Previous Message Andrew Dunstan 2022-03-22 23:51:20 Re: New Object Access Type hooks