Re: Todo: Teach planner to evaluate multiple windows in the optimal order

From: Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com>
To: Vik Fearing <vik(at)postgresfriends(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Date: 2023-01-04 15:30:50
Message-ID: 9620d994-b89a-2dcf-fca5-821e19d56858@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 05/01/23 07:48, Vik Fearing wrote:
> On 1/4/23 13:07, Ankit Kumar Pandey wrote:
>> Also, one thing, consider the following query:
>>
>> explain analyze select row_number() over (order by a,b),count(*) over
>> (order by a) from abcd order by a,b,c;
>>
>> In this case, sorting is done on (a,b) followed by incremental sort
>> on c at final stage.
>>
>> If we do just one sort: a,b,c at first stage then there won't be need
>> to do another sort (incremental one).
>
>
> This could give incorrect results.  Consider the following query:
>
> postgres=# select a, b, c, rank() over (order by a, b)
> from (values (1, 2, 1), (1, 2, 2), (1, 2, 1)) as abcd (a, b, c)
> order by a, b, c;
>
>  a | b | c | rank
> ---+---+---+------
>  1 | 2 | 1 |    1
>  1 | 2 | 1 |    1
>  1 | 2 | 2 |    1
> (3 rows)
>
>
> If you change the window's ordering like you suggest, you get this
> different result:
>
>
> postgres=# select a, b, c, rank() over (order by a, b, c)
> from (values (1, 2, 1), (1, 2, 2), (1, 2, 1)) as abcd (a, b, c)
> order by a, b, c;
>
>  a | b | c | rank
> ---+---+---+------
>  1 | 2 | 1 |    1
>  1 | 2 | 1 |    1
>  1 | 2 | 2 |    3
> (3 rows)
>
>
We are already doing something like I mentioned.

Consider this example:

explain SELECT rank() OVER (ORDER BY a), count(*) OVER (ORDER BY a,b)
FROM abcd;
                                QUERY PLAN
--------------------------------------------------------------------------
 WindowAgg  (cost=83.80..127.55 rows=1250 width=24)
   ->  WindowAgg  (cost=83.80..108.80 rows=1250 width=16)
         ->  Sort  (cost=83.80..86.92 rows=1250 width=8)
               Sort Key: a, b
               ->  Seq Scan on abcd  (cost=0.00..19.50 rows=1250 width=8)
(5 rows)

If it is okay to do extra sort for first window function (rank) here,
why would it be

any different in case which I mentioned?

My suggestion rest on assumption that for a window function, say

rank() OVER (ORDER BY a), ordering of columns (other than column 'a')
shouldn't matter.

--
Regards,
Ankit Kumar Pandey

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2023-01-04 15:57:28 Re: psql's FETCH_COUNT (cursor) is not being respected for CTEs
Previous Message Daniel Verite 2023-01-04 15:22:03 Re: psql's FETCH_COUNT (cursor) is not being respected for CTEs