Re: Todo: Teach planner to evaluate multiple windows in the optimal order

From: Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org>
Subject: Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Date: 2023-01-07 16:45:49
Message-ID: 295ebbb0-f5b4-f090-b27e-9e069ef7eae1@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 07/01/23 21:57, Ankit Kumar Pandey wrote:
> On 07/01/23 09:58, David Rowley wrote:
> >
> > The attached patch has no tests added. It's going to need some of
> > those.
>
> While writing test cases, I found that optimization do not happen for
> case #1
>
> (which is prime candidate for such operation) like
>
> EXPLAIN (COSTS OFF)
> SELECT empno,
>        depname,
>        min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary,
>        sum(salary) OVER (PARTITION BY depname) depsalary
> FROM empsalary
> ORDER BY depname, empno, enroll_date
>
> This happens because mutual exclusiveness of two operands (when number
> of window functions > 1) viz
>
> is_sorted and last activeWindow in the condition:
>
> ( !is_sorted && lnext(activeWindows, l) == NULL)
>
> For 2nd last window function, is_sorted is false and path keys get added.
>
> In next run (for last window function), is_sorted becomes true and whole
> optimization
>
> part is skipped.
>
> Note: Major issue that if I remove is_sorted from condition, even though
>
> path keys are added, it still do not perform optimization and works same
> as in master/unoptimized case.
>
> Perhaps adding path keys at last window function is not doing trick?
> Maybe we need to add pathkeys
>
> to all window functions which are subset of query's order by
> irrespective of being last or not?
>
>
> Case #2:
>
> For presorted columns, eg
>
> CREATE INDEX depname_idx ON empsalary(depname);
> SET enable_seqscan=0;
> EXPLAIN (COSTS OFF)
> SELECT empno,
>         min(salary) OVER (PARTITION BY depname) depminsalary
> FROM empsalary
> ORDER BY depname, empno;
>
> Is this correct plan:
>
> a)
>
>                       QUERY PLAN
> -------------------------------------------------------
>  Incremental Sort
>    Sort Key: depname, empno
>    Presorted Key: depname
>    ->  WindowAgg
>          ->  Index Scan using depname_idx on empsalary
> (5 rows)
>
> or this:
>
> b) (Akin to Optimized version)
>
>                       QUERY PLAN
> -------------------------------------------------------
>  WindowAgg
>    ->  Incremental Sort
>          Sort Key: depname, empno
>          Presorted Key: depname
>          ->  Index Scan using depname_idx on empsalary
> (5 rows)
>
> Patched version does (a) because of is_sorted condition.
>
> If we remove both is_sorted and lnext(activeWindows, l) == NULL conditions,
>
> we get correct results in these two cases.
>
>
Attached patch with test cases.

For case #2, test cases still uses (a) as expected output which I don't
think is right

and we should revisit. Other than that, only failing case is due to
issue mentioned in case #1.

Thanks

--
Regards,
Ankit Kumar Pandey

Attachment Content-Type Size
better_windowclause_sorting_dgr-v2.patch text/x-patch 10.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michail Nikolaev 2023-01-07 18:06:06 Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE
Previous Message Ankit Kumar Pandey 2023-01-07 16:27:07 Re: Todo: Teach planner to evaluate multiple windows in the optimal order