From: | Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org> |
Subject: | Re: Todo: Teach planner to evaluate multiple windows in the optimal order |
Date: | 2023-01-18 17:27:09 |
Message-ID: | 8599211e-1ea9-8827-4b9b-adeb865656bd@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> On 18/01/23 15:12, David Rowley wrote:
> I also thought I'd better test that foreach_delete_current() works
> with foreach_reverse(). I can confirm that it *does not* work
> correctly. I guess maybe you only tested the fact that it deleted the
> current item and not that the subsequent loop correctly went to the
> item directly before the deleted item. There's a problem with that. We
> skip an item.
Hmm, not really sure why did I miss that. I tried this again (added
following in postgres.c above
PortalStart)
List* l = NIL;
l = lappend(l, 1);
l = lappend(l, 2);
l = lappend(l, 3);
l = lappend(l, 4);
ListCell *lc;
foreach_reverse(lc, l)
{
if (foreach_current_index(lc) == 2) // delete 3
{
foreach_delete_current(l, lc);
}
}
foreach(lc, l)
{
int i = (int) lfirst(lc);
ereport(LOG,(errmsg("%d", i)));
}
Got result:
2023-01-18 20:23:28.115 IST [51007] LOG: 1
2023-01-18 20:23:28.115 IST [51007] STATEMENT: select pg_backend_pid();
2023-01-18 20:23:28.115 IST [51007] LOG: 2
2023-01-18 20:23:28.115 IST [51007] STATEMENT: select pg_backend_pid();
2023-01-18 20:23:28.115 IST [51007] LOG: 4
2023-01-18 20:23:28.115 IST [51007] STATEMENT: select pg_backend_pid();
I had expected list_delete_cell to take care of rest.
> Instead of fixing that, I think it's likely better just to loop
> backwards manually with a for() loop, so I've adjusted the patch to
> work that way. It's quite likely that the additional code in
> foreach() and what was in foreach_reverse() is slower than looping
> manually due to the additional work those macros do to set the cell to
> NULL when we run out of cells to loop over.
Okay, current version looks fine as well.
> I made another pass over the v7 patch and fixed a bug that was
> disabling the optimization when the deepest WindowAgg had a
> runCondition. This should have been using llast_node instead of
> linitial_node. The top-level WindowAgg is the last in the list.
> I also made a pass over the tests and added a test case for the
> runCondition check to make sure we disable the optimization when the
> top-level WindowAgg has one of those.
> I wasn't sure what your test comments case numbers were meant to represent.
> They were not aligned with the code comments that document when the optimisation is
> disabled, they started out aligned, but seemed to go off the rails at
> #3. I've now made them follow the comments in create_one_window_path()
> and made it more clear what we expect the test outcome to be in each
> case.
Those were just numbering for exceptional cases, making them in sync
with comments
wasn't really on my mind, but now they looks better.
> I've attached the v9 patch. I feel like this patch is quite
> self-contained and I'm quite happy with it now. If there are no
> objections soon, I'm planning on pushing it.
Patch is already rebased with latest master, tests are all green.
Tried some basic profiling and it looked good.
I also tried a bit unrealistic case.
create table abcdefgh(a int, b int, c int, d int, e int, f int, g int, h int);
insert into abcdefgh select a,b,c,d,e,f,g,h from
generate_series(1,7) a,
generate_series(1,7) b,
generate_series(1,7) c,
generate_series(1,7) d,
generate_series(1,7) e,
generate_series(1,7) f,
generate_series(1,7) g,
generate_series(1,7) h;
explain analyze select count(*) over (order by a),
row_number() over (partition by a order by b) from abcdefgh order by a,b,c,d,e,f,g,h;
In patch version
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
---------------
WindowAgg (cost=1023241.14..1225007.67 rows=5764758 width=48) (actual
time=64957.894..81950.352 rows=5764801 loops=1)
-> WindowAgg (cost=1023241.14..1138536.30 rows=5764758 width=40)
(actual time=37959.055..60391.799 rows=5764801 loops
=1)
-> Sort (cost=1023241.14..1037653.03 rows=5764758 width=32)
(actual time=37959.045..52968.791 rows=5764801 loop
s=1)
Sort Key: a, b, c, d, e, f, g, h
Sort Method: external merge Disk: 237016kB
-> Seq Scan on abcdefgh (cost=0.00..100036.58
rows=5764758 width=32) (actual time=0.857..1341.107 rows=57
64801 loops=1)
Planning Time: 0.168 ms
Execution Time: 82748.789 ms
(8 rows)
In Master
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
---------------------
Incremental Sort (cost=1040889.72..1960081.97 rows=5764758 width=48)
(actual time=23461.815..69654.700 rows=5764801 loop
s=1)
Sort Key: a, b, c, d, e, f, g, h
Presorted Key: a, b
Full-sort Groups: 49 Sort Method: quicksort Average Memory: 30kB
Peak Memory: 30kB
Pre-sorted Groups: 49 Sort Method: external merge Average Disk:
6688kB Peak Disk: 6688kB
-> WindowAgg (cost=1023241.14..1225007.67 rows=5764758 width=48)
(actual time=22729.171..40189.407 rows=5764801 loops
=1)
-> WindowAgg (cost=1023241.14..1138536.30 rows=5764758
width=40) (actual time=8726.562..18268.663 rows=5764801
loops=1)
-> Sort (cost=1023241.14..1037653.03 rows=5764758
width=32) (actual time=8726.551..11291.494 rows=5764801
loops=1)
Sort Key: a, b
Sort Method: external merge Disk: 237016kB
-> Seq Scan on abcdefgh (cost=0.00..100036.58
rows=5764758 width=32) (actual time=0.029..1600.042 r
ows=5764801 loops=1)
Planning Time: 2.742 ms
Execution Time: 71172.586 ms
(13 rows)
Patch version is approx 11 sec slower.
Patch version sort took around 15 sec, whereas master sort took ~2 + 46
= around 48 sec
BUT somehow master version is faster as Window function took 10 sec less
time.
Maybe I am interpreting it wrong but still wanted to bring this to notice.
Thanks,
Ankit
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2023-01-18 17:34:47 | Re: Doc: Rework contrib appendix -- informative titles, tweaked sentences |
Previous Message | Tomas Vondra | 2023-01-18 17:25:15 | Re: Implement missing join selectivity estimation for range types |