Re: Is postgres able to share sorts required by common partition window functions?

From: Sebastien Arod <sebastien(dot)arod(at)gmail(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Michael Lewis <mlewis(at)entrata(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Is postgres able to share sorts required by common partition window functions?
Date: 2020-07-07 15:01:20
Message-ID: CADd42iGnivBasQkqDAgxunKJRBQ-F=7w-TV2i_JfdCb0bmdigg@mail.gmail.com
Lists: pgsql-general

Michael, David, thanks for your quick replies.

*(at)Michael*
I initially dismissed writing this query using joins or subselects because
the real query has about 80 columns, and I was afraid that having 80
joins/subselects would cause issues in postgresql, including the planner
falling back to GEQO.
I'll test it anyway with real data and see how it behaves.
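
For reference, here is a minimal sketch of the kind of rewrite I mean (one
correlated subselect per output column; with the real query that would be
~80 of them, which is what worries me):

select
  t.c1,
  (select s.c2
     from t s
    where s.c1 = t.c1
    order by s.c2
    limit 1) as first_c2  -- same value as first_value(c2) OVER (PARTITION BY c1 ORDER BY c2)
from t;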

*(at)David About Subsequent sorts*
From my experiments, the time of subsequent sorts is not reduced by a
previous sort; it is in fact higher.
In the example below, in Query 1 the Sort node on key c1, c2 takes
~1893 ms excluding child nodes.
In Query 2:

- the Sort node on key c1 takes ~1558 ms excluding child nodes
- the Sort node on key c1, c2 takes ~2964 ms excluding child nodes,
which is higher than the time for the same sort key in Query 1, even
though in Query 2 postgresql could have benefited from the preliminary
sort on c1.

I can't figure out why it would be slower.

*Query1*
explain analyze select
first_value(c2) OVER (PARTITION BY c1 order by c2)
from t;

*Explain Plan Query1*
WindowAgg  (cost=135042.46..155042.48 rows=1000001 width=47) (actual time=1555.005..2639.659 rows=1000001 loops=1)
  ->  Sort  (cost=135042.46..137542.46 rows=1000001 width=15) (actual time=1554.989..2070.462 rows=1000001 loops=1)
        Sort Key: c1, c2
        Sort Method: external merge  Disk: 25960kB
        ->  Seq Scan on t  (cost=0.00..18294.01 rows=1000001 width=15) (actual time=0.013..177.037 rows=1000001 loops=1)
Planning Time: 0.115 ms
Execution Time: 2693.935 ms

*Query2*
explain analyze select
first_value(c2) OVER (PARTITION BY c1),
first_value(c2) OVER (PARTITION BY c1 order by c2)
from t;
*Explain Plan Query2*
WindowAgg  (cost=313730.43..333730.45 rows=1000001 width=79) (actual time=4692.257..5702.226 rows=1000001 loops=1)
  ->  Sort  (cost=313730.43..316230.43 rows=1000001 width=47) (actual time=4692.152..5174.928 rows=1000001 loops=1)
        Sort Key: c1, c2
        Sort Method: external merge  Disk: 32688kB
        ->  WindowAgg  (cost=135042.46..152542.48 rows=1000001 width=47) (actual time=1188.109..2210.845 rows=1000001 loops=1)
              ->  Sort  (cost=135042.46..137542.46 rows=1000001 width=15) (actual time=1188.099..1709.253 rows=1000001 loops=1)
                    Sort Key: c1
                    Sort Method: external merge  Disk: 25960kB
                    ->  Seq Scan on t  (cost=0.00..18294.01 rows=1000001 width=15) (actual time=0.014..150.435 rows=1000001 loops=1)
Planning Time: 0.059 ms
Execution Time: 5756.591 ms

*(at)David About using an index*
You are correct that a simple index cannot be leveraged efficiently.
However, a covering index that includes all columns and starts with the
partition column c1 could be used to avoid sorting on c1, right?

But in my tests it is not used. In the following example covering_index
could be used to avoid sorting on c1, but it isn't:
*Covering Index*
create index covering_index on t (c1, c2, c3, c4);

*Query*
explain analyze select
  c1,
  first_value(c2) OVER (PARTITION BY c1 order by c2, c4) AS c2,
  first_value(c3) OVER (PARTITION BY c1 order by coalesce(c4, '000'), c3) AS c3
from t;

*Plan*
WindowAgg  (cost=417853.93..440353.95 rows=1000001 width=123) (actual time=3071.980..4184.030 rows=1000001 loops=1)
  ->  Sort  (cost=417853.93..420353.93 rows=1000001 width=91) (actual time=3071.962..3575.641 rows=1000001 loops=1)
        Sort Key: c1, (COALESCE(c4, '000'::character varying)), c3
        Sort Method: external merge  Disk: 52912kB
        ->  WindowAgg  (cost=193152.96..215652.98 rows=1000001 width=91) (actual time=1268.373..2271.463 rows=1000001 loops=1)
              ->  Sort  (cost=193152.96..195652.96 rows=1000001 width=59) (actual time=1268.357..1711.526 rows=1000001 loops=1)
                    Sort Key: c1, c2, c4
                    Sort Method: external merge  Disk: 46168kB
                    ->  Seq Scan on t  (cost=0.00..18294.01 rows=1000001 width=59) (actual time=0.014..163.347 rows=1000001 loops=1)
Planning Time: 0.081 ms
Execution Time: 4250.367 ms

However, the index is used if the partition by + order by of one of the
window clauses entirely matches the beginning of the index. For instance
the query:
select
  c1,
  first_value(c2) OVER (PARTITION BY c1 order by c2) AS c2,
  first_value(c3) OVER (PARTITION BY c1 order by coalesce(c4, '000'), c3) AS c3
from t;

uses covering_index to avoid the sort on c1, c2.
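
If that is right, then in theory an index whose leading columns match the
full sort key (c1, c2, c4) of the first window clause should let the
planner skip that sort entirely. An untested sketch (t_c1_c2_c4_c3 is a
hypothetical index name; the trailing c3 only makes it covering for this
query):

-- Leading columns match PARTITION BY c1 ORDER BY c2, c4 of the first window
create index t_c1_c2_c4_c3 on t (c1, c2, c4, c3);

I would still expect the second window's sort on
(c1, coalesce(c4, '000'), c3) to be done from scratch, since the coalesce
expression is not in the index.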

On Tue, Jul 7, 2020 at 12:48 AM David G. Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com> wrote:

> On Monday, July 6, 2020, Sebastien Arod <sebastien(dot)arod(at)gmail(dot)com> wrote:
>
>> I would have expected postgresql to "share" a preliminary sort on c1 that
>> would then be useful to reduce the work on all window functions but it
>> doesn't.
>>
>
> The plan shown does share - the output of one sort goes into another.
> Subsequent sorts still have to happen but they should be faster as the
> first field is already grouped. Doesn’t change the plan though.
>
>
>> I even created an index on c1 hoping that postgresql would be able to use
>> it in order to minimize the cost of the sorts but I couldn't make it use it.
>>
>
> Use it how? You are still evaluating 250k groups so doing anything
> piece-wise seems like an expected loss compared to sequential scan.
>
> David J.
>
>
