Re: Todo: Teach planner to evaluate multiple windows in the optimal order

From: Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org>
Subject: Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Date: 2023-01-06 13:11:33
Message-ID: 01248493-182b-0f92-f2f2-ff28dc4b2e83@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Sorry if multiple mails has been sent for this.

> On 05/01/23 12:53, David Rowley wrote:
>>
>> We *can* reuse Sorts where a more strict or equivalent sort order is
>> available.  The question is how do we get the final WindowClause to do
>> something slightly more strict to save having to do anything for the
>> ORDER BY.  One way you might think would be to adjust the
>> WindowClause's orderClause to add the additional clauses, but that
>> cannot be done because that would cause are_peers() in nodeWindowAgg.c
>> to not count some rows as peers when they maybe should be given a less
>> strict orderClause in the WindowClause.

I attempted this in attached patch.

#1. No op case

------------------------------------------In patched
version-----------------------------------------------

explain (costs off) SELECT rank() OVER (ORDER BY b), count(*) OVER
(ORDER BY a,b,c) FROM abcd order by a;
                QUERY PLAN
------------------------------------------
 WindowAgg
   ->  Sort
         Sort Key: a, b, c
         ->  WindowAgg
               ->  Sort
                     Sort Key: b
                     ->  Seq Scan on abcd
(7 rows)

explain (costs off) SELECT rank() OVER (ORDER BY b), count(*) OVER
(ORDER BY a,b,c) FROM abcd;
                QUERY PLAN
------------------------------------------
 WindowAgg
   ->  Sort
         Sort Key: b
         ->  WindowAgg
               ->  Sort
                     Sort Key: a, b, c
                     ->  Seq Scan on abcd
(7 rows)

----------------------------------------------In
master--------------------------------------------------------

explain (costs off) SELECT rank() OVER (ORDER BY b), count(*) OVER
(ORDER BY a,b,c) FROM abcd order by a;
                QUERY PLAN
------------------------------------------
 WindowAgg
   ->  Sort
         Sort Key: a, b, c
         ->  WindowAgg
               ->  Sort
                     Sort Key: b
                     ->  Seq Scan on abcd
(7 rows)
explain (costs off) SELECT rank() OVER (ORDER BY b), count(*) OVER
(ORDER BY a,b,c) FROM abcd;
                QUERY PLAN
------------------------------------------
 WindowAgg
   ->  Sort
         Sort Key: b
         ->  WindowAgg
               ->  Sort
                     Sort Key: a, b, c
                     ->  Seq Scan on abcd
(7 rows)

No change between patched version and master.

2. In case where optimization can happen

----------------------------In patched
version-------------------------------------------------------

explain (costs off) SELECT rank() OVER (ORDER BY b), count(*) OVER
(ORDER BY a) FROM abcd order by a,b;
                QUERY PLAN
------------------------------------------
 WindowAgg
   ->  Sort
         Sort Key: a, b
         ->  WindowAgg
               ->  Sort
                     Sort Key: b
                     ->  Seq Scan on abcd
(7 rows)

explain (costs off)  SELECT rank() OVER (ORDER BY a), count(*) OVER
(ORDER BY b), count(*) OVER (PARTITION BY a ORDER BY b) FROM abcd order
by a,b,c,d;
                   QUERY PLAN
------------------------------------------------
 WindowAgg
   ->  WindowAgg
         ->  Sort
               Sort Key: a, b, c, d
               ->  WindowAgg
                     ->  Sort
                           Sort Key: b
                           ->  Seq Scan on abcd
(8 rows)

-------------------------------------------In
master--------------------------------------------------------

explain (costs off) SELECT rank() OVER (ORDER BY b), count(*) OVER
(ORDER BY a) FROM abcd order by a,b;
                   QUERY PLAN
------------------------------------------------
 Incremental Sort
   Sort Key: a, b
   Presorted Key: a
   ->  WindowAgg
         ->  Sort
               Sort Key: a
               ->  WindowAgg
                     ->  Sort
                           Sort Key: b
                           ->  Seq Scan on abcd
(10 rows)

explain (costs off)  SELECT rank() OVER (ORDER BY a), count(*) OVER
(ORDER BY b), count(*) OVER (PARTITION BY a ORDER BY b) FROM abcd order
by a,b,c,d;
                      QUERY PLAN
------------------------------------------------------
 Incremental Sort
   Sort Key: a, b, c, d
   Presorted Key: a, b
   ->  WindowAgg
         ->  WindowAgg
               ->  Sort
                     Sort Key: a, b
                     ->  WindowAgg
                           ->  Sort
                                 Sort Key: b
                                 ->  Seq Scan on abcd
(11 rows)

Patched version removes few sorts.

Regression tests all passed so it is not breaking anything existing.

We don't have any tests for verifying sorting plan in window functions
(which would have failed, if present).

Please let me know any feedbacks (I have added some my own concerns in
the comments)

Thanks

--
Regards,
Ankit Kumar Pandey

Attachment Content-Type Size
v1-0002-Teach-planner-to-optimize-sorting-operations-for-win.patch text/x-patch 3.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2023-01-06 13:21:44 Re: Resolve UNKNOWN type to relevant type instead of text type while bulk update using values
Previous Message Peter Eisentraut 2023-01-06 12:48:49 Re: Add BufFileRead variants with short read and EOF detection