Re: Todo: Teach planner to evaluate multiple windows in the optimal order

From: Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org>
Subject: Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Date: 2023-01-18 17:27:09
Message-ID: 8599211e-1ea9-8827-4b9b-adeb865656bd@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> On 18/01/23 15:12, David Rowley wrote:

> I also thought I'd better test that foreach_delete_current() works
> with foreach_reverse(). I can confirm that it *does not* work
> correctly. I guess maybe you only tested the fact that it deleted the
> current item and not that the subsequent loop correctly went to the
> item directly before the deleted item. There's a problem with that. We
> skip an item.

Hmm, not really sure why did I miss that. I tried this again (added
following in postgres.c above

PortalStart)

List* l = NIL;
l = lappend(l, 1);
l = lappend(l, 2);
l = lappend(l, 3);
l = lappend(l, 4);

ListCell *lc;
foreach_reverse(lc, l)
{
if (foreach_current_index(lc) == 2) // delete 3
{
foreach_delete_current(l, lc);
}
}

foreach(lc, l)
{
int i = (int) lfirst(lc);
ereport(LOG,(errmsg("%d", i)));
}

Got result:
2023-01-18 20:23:28.115 IST [51007] LOG: 1
2023-01-18 20:23:28.115 IST [51007] STATEMENT: select pg_backend_pid();
2023-01-18 20:23:28.115 IST [51007] LOG: 2
2023-01-18 20:23:28.115 IST [51007] STATEMENT: select pg_backend_pid();
2023-01-18 20:23:28.115 IST [51007] LOG: 4
2023-01-18 20:23:28.115 IST [51007] STATEMENT: select pg_backend_pid();

I had expected list_delete_cell to take care of rest.

> Instead of fixing that, I think it's likely better just to loop
> backwards manually with a for() loop, so I've adjusted the patch to
> work that way. It's quite likely that the additional code in
> foreach() and what was in foreach_reverse() is slower than looping
> manually due to the additional work those macros do to set the cell to
> NULL when we run out of cells to loop over.

Okay, current version looks fine as well.

> I made another pass over the v7 patch and fixed a bug that was
> disabling the optimization when the deepest WindowAgg had a
> runCondition. This should have been using llast_node instead of
> linitial_node. The top-level WindowAgg is the last in the list.

> I also made a pass over the tests and added a test case for the
> runCondition check to make sure we disable the optimization when the
> top-level WindowAgg has one of those.

> I wasn't sure what your test comments case numbers were meant to represent.
> They were not aligned with the code comments that document when the optimisation is
> disabled, they started out aligned, but seemed to go off the rails at
> #3. I've now made them follow the comments in create_one_window_path()
> and made it more clear what we expect the test outcome to be in each
> case.

Those were just numbering for exceptional cases, making them in sync
with comments

wasn't really on my mind, but now they looks better.

> I've attached the v9 patch. I feel like this patch is quite
> self-contained and I'm quite happy with it now. If there are no
> objections soon, I'm planning on pushing it.

Patch is already rebased with latest master, tests are all green.

Tried some basic profiling and it looked good.

I also tried a bit unrealistic case.

create table abcdefgh(a int, b int, c int, d int, e int, f int, g int, h int);

insert into abcdefgh select a,b,c,d,e,f,g,h from
generate_series(1,7) a,
generate_series(1,7) b,
generate_series(1,7) c,
generate_series(1,7) d,
generate_series(1,7) e,
generate_series(1,7) f,
generate_series(1,7) g,
generate_series(1,7) h;

explain analyze select count(*) over (order by a),
row_number() over (partition by a order by b) from abcdefgh order by a,b,c,d,e,f,g,h;

In patch version

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------
---------------
 WindowAgg  (cost=1023241.14..1225007.67 rows=5764758 width=48) (actual
time=64957.894..81950.352 rows=5764801 loops=1)
   ->  WindowAgg  (cost=1023241.14..1138536.30 rows=5764758 width=40)
(actual time=37959.055..60391.799 rows=5764801 loops
=1)
         ->  Sort  (cost=1023241.14..1037653.03 rows=5764758 width=32)
(actual time=37959.045..52968.791 rows=5764801 loop
s=1)
               Sort Key: a, b, c, d, e, f, g, h
               Sort Method: external merge  Disk: 237016kB
               ->  Seq Scan on abcdefgh  (cost=0.00..100036.58
rows=5764758 width=32) (actual time=0.857..1341.107 rows=57
64801 loops=1)
 Planning Time: 0.168 ms
 Execution Time: 82748.789 ms
(8 rows)

In Master

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------
---------------------
 Incremental Sort  (cost=1040889.72..1960081.97 rows=5764758 width=48)
(actual time=23461.815..69654.700 rows=5764801 loop
s=1)
   Sort Key: a, b, c, d, e, f, g, h
   Presorted Key: a, b
   Full-sort Groups: 49  Sort Method: quicksort  Average Memory: 30kB 
Peak Memory: 30kB
   Pre-sorted Groups: 49  Sort Method: external merge  Average Disk:
6688kB  Peak Disk: 6688kB
   ->  WindowAgg  (cost=1023241.14..1225007.67 rows=5764758 width=48)
(actual time=22729.171..40189.407 rows=5764801 loops
=1)
         ->  WindowAgg  (cost=1023241.14..1138536.30 rows=5764758
width=40) (actual time=8726.562..18268.663 rows=5764801
loops=1)
               ->  Sort  (cost=1023241.14..1037653.03 rows=5764758
width=32) (actual time=8726.551..11291.494 rows=5764801
 loops=1)
                     Sort Key: a, b
                     Sort Method: external merge  Disk: 237016kB
                     ->  Seq Scan on abcdefgh (cost=0.00..100036.58
rows=5764758 width=32) (actual time=0.029..1600.042 r
ows=5764801 loops=1)
 Planning Time: 2.742 ms
 Execution Time: 71172.586 ms
(13 rows)

Patch version is approx 11 sec slower.

Patch version sort took around 15 sec, whereas master sort took ~2 + 46
= around 48 sec

BUT somehow master version is faster as Window function took 10 sec less
time.

Maybe I am interpreting it wrong but still wanted to bring this to notice.

Thanks,

Ankit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2023-01-18 17:34:47 Re: Doc: Rework contrib appendix -- informative titles, tweaked sentences
Previous Message Tomas Vondra 2023-01-18 17:25:15 Re: Implement missing join selectivity estimation for range types