Re: Todo: Teach planner to evaluate multiple windows in the optimal order

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com>
Cc: pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org>
Subject: Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Date: 2023-01-09 12:23:14
Message-ID: CAApHDvqVjeHg6UgeCvFOa-uCaEFix9C_3YaT2qxsMtjN4oKbNQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 9 Jan 2023 at 21:34, Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com> wrote:
>
>
> > On 09/01/23 03:48, David Rowley wrote:
> > 3. I don't think you need to set "index" on every loop. Why not just
> set it to foreach_current_index(l) - 1; break;
>
> Consider this query
>
> EXPLAIN (COSTS OFF)
> SELECT empno,
> depname,
> min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary,
> sum(salary) OVER (PARTITION BY depname) depsalary,
> count(*) OVER (ORDER BY enroll_date DESC) c
> FROM empsalary
> ORDER BY depname, empno, enroll_date;
>
>
> Here, W1 = min(salary) OVER (PARTITION BY depname ORDER BY empno) W2 =
> sum(salary) OVER (PARTITION BY depname)
>
> W3 = count(*) OVER (ORDER BY enroll_date DESC)
>
> (1,2,3 are winref).
>
>
> activeWindows = [W3, W1, W2]
>
> If we iterate from reverse and break at first occurrence, we will
> break at W2 and add extra keys there, but what we want it to add
> keys at W1 so that it gets spilled to W2 (as existing logic is designed to
> carry over sorted cols first to last).

We need to keep looping backwards until we find the first WindowClause
which does not contain the pathkeys of the ORDER BY. When we find a
WindowClause that does not contain the pathkeys of the ORDER BY, then
we must set the sort_pushdown_idx to the index of the prior
WindowClause. I couldn't quite understand why the foreach() loop's
condition couldn't just be "if (foreach_current_index(l) ==
sort_pushdown_idx)", but I see that if we don't check if the path is
already correctly sorted that we could end up pushing the sort down
onto the path that's already correctly sorted. We decided we didn't
want to move the sort around if it does not reduce the amount of
sorting.

I had to try this out for myself and I've ended up with the attached
v6 patch. All the tests you added still pass. Although, I didn't
really study the tests yet to see if everything we talked about is
covered.

It turned out the sort_pushdown_idx = foreach_current_index(l) - 1;
break; didn't work as if all the WindowClauses have pathkeys contained
in the order by pathkeys then we don't ever set sort_pushdown_idx. I
adjusted it to do:

if (pathkeys_contained_in(window_pathkeys, orderby_pathkeys))
sort_pushdown_idx = foreach_current_index(l);
else
break;

I also fixed up the outdated comments and changed it so we only set
orderby_pathkeys once instead of once per loop in the
foreach_reverse() loop.

I gave some thought to whether doing foreach_delete_current() is safe
within a foreach_reverse() loop. I didn't try it, but I couldn't see
any reason why not. It is pretty late here and I'd need to test that
to be certain. If it turns out not to be safe then we need to document
that fact in the comments of the foreach_reverse() macro and the
foreach_delete_current() macro.

David

Attachment Content-Type Size
better_windowclause_sorting_dgr-v6.patch text/plain 16.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2023-01-09 12:29:05 Re: MERGE ... RETURNING
Previous Message Dean Rasheed 2023-01-09 12:03:34 Re: [PATCH] random_normal function