Re: Todo: Teach planner to evaluate multiple windows in the optimal order

From: Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org>
Subject: Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Date: 2023-01-09 17:15:12
Message-ID: 4ef7f947-526b-c4ec-f7a0-8b4586e693e4@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> On 09/01/23 17:53, David Rowley wrote:

> We need to keep looping backwards until we find the first WindowClause
> which does not contain the pathkeys of the ORDER BY.

I found the cause of confusion, *first* WindowClause means first from

forward direction. Since we are looping from backward, I took it first

from the last.

eg:

select count(*) over (order by a), count(*) over (order by a,b),
count(*) over (order by a,b,c) from abcd order by a,b,c,d;

first window clause is count(*) over (order by a) which we are using for
order-by pushdown.

This is in sync with implementation as well.

> I couldn't quite understand why the foreach() loop's
> condition couldn't just be "if (foreach_current_index(l) ==
> sort_pushdown_idx)", but I see that if we don't check if the path is
> already correctly sorted that we could end up pushing the sort down
> onto the path that's already correctly sorted. We decided we didn't
> want to move the sort around if it does not reduce the amount of
> sorting.

Yes, this was the reason, the current patch handles this without is_sort
now, which is great.

> All the tests you added still pass. Although, I didn't
> really study the tests yet to see if everything we talked about is
> covered.

It covers general cases and exceptions. Also, I did few additional
tests. Looked good.

> It turned out the sort_pushdown_idx = foreach_current_index(l) - 1;
> break; didn't work as if all the WindowClauses have pathkeys contained
> in the order by pathkeys then we don't ever set sort_pushdown_idx. I
> adjusted it to do:

> if (pathkeys_contained_in(window_pathkeys, orderby_pathkeys))
> sort_pushdown_idx = foreach_current_index(l);
> else
> break;

Yes, that would have been problematic. I have verified this case

and on related note, I have added a test case that ensure order by pushdown

shouldn't happen if window function's order by is superset of query's
order by.

> I also fixed up the outdated comments and changed it so we only set
> orderby_pathkeys once instead of once per loop in the
> foreach_reverse() loop.

Thanks, code look a lot neater now (is_sorted is gone and handled in a
better way).

> I gave some thought to whether doing foreach_delete_current() is safe
> within a foreach_reverse() loop. I didn't try it, but I couldn't see
> any reason why not. It is pretty late here and I'd need to test that
> to be certain. If it turns out not to be safe then we need to document
> that fact in the comments of the foreach_reverse() macro and the
> foreach_delete_current() macro.

I tested foreach_delete_current inside foreach_reverse loop.

It worked fine.

I have attached patch with one extra test case (as mentioned above).
Rest of the changes are looking fine.

Ran pgbench again and optimized version still had a lead (168 tps vs 135
tps) in performance.

Do we have any pending items for this patch now?

--

Regards,
Ankit Kumar Pandey

Attachment Content-Type Size
better_windowclause_sorting_dgr-v7.patch text/x-patch 17.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2023-01-09 17:34:35 Re: wake up logical workers after ALTER SUBSCRIPTION
Previous Message Matthias van de Meent 2023-01-09 16:50:10 Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE