From: | Sebastien Arod <sebastien(dot)arod(at)gmail(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Is postgres able to share sorts required by common partition window functions? |
Date: | 2020-07-06 16:43:31 |
Message-ID: | CADd42iFZWwYNsXjEM_3HWK3QnfiCrMNmpOkZqyBQCabnVxOPtw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi all,
I'm trying to optimize the following query on postgres 11.6 (running on
Aurora)
select distinct
c1,
first_value(c2) OVER (PARTITION BY c1 order by c2) AS c2,
first_value(c3) OVER (PARTITION BY c1 order by c3) AS c3,
first_value(c4) OVER (PARTITION BY c1 order by c4) AS c4
from
t;
From the explain plan (attached at the end of the email) I see that
postgresql is doing several sorts one per window function and one for the
distinct that seems ok.
However all the window functions being on the same partition I would have
expected postgresql to "share" a preliminary sort on c1 that would then be
useful to reduce the work on all window functions but it doesn't.
I even created an index on c1 hoping that postgresql would be able to use
it in order to minimize the cost of the sorts but I couldn't make it use it.
Is there something I am missing?
You can find below a script to set up a table and data to reproduce as well
as the explain plan.
*Setup Script*
create table t(
pk varchar(200) PRIMARY key,
c1 varchar(200),
c2 varchar(200),
c3 varchar(200),
c4 varchar(200)
);
create index i1 on t (c1);
insert into t
(pk, c1, c2, c3, c4 )
select
generate_series::text pk,
'Grp' ||(generate_series / 4)::text c1,
generate_series::text c2,
generate_series::text c3,
generate_series::text c4
from generate_series(0, 1000000);
*Explain Plan*
Unique (cost=808480.87..820980.88 rows=1000001 width=123) (actual
time=7131.675..7781.082 rows=250001 loops=1)
-> Sort (cost=808480.87..810980.87 rows=1000001 width=123) (actual
time=7131.673..7603.926 rows=1000001 loops=1)
Sort Key: c1, (first_value(c2) OVER (?)), (first_value(c3) OVER
(?)), (first_value(c4) OVER (?))
Sort Method: external merge Disk: 59640kB
-> WindowAgg (cost=558937.90..578937.92 rows=1000001 width=123)
(actual time=5179.374..6268.937 rows=1000001 loops=1)
-> Sort (cost=558937.90..561437.90 rows=1000001 width=91)
(actual time=5179.355..5679.136 rows=1000001 loops=1)
Sort Key: c1, c4
Sort Method: external merge Disk: 52912kB
-> WindowAgg (cost=336736.93..356736.95 rows=1000001
width=91) (actual time=3260.950..4389.116 rows=1000001 loops=1)
-> Sort (cost=336736.93..339236.93 rows=1000001
width=59) (actual time=3260.934..3778.385 rows=1000001 loops=1)
Sort Key: c1, c3
Sort Method: external merge Disk: 46176kB
-> WindowAgg (cost=141877.96..161877.98
rows=1000001 width=59) (actual time=1444.692..2477.284 rows=1000001 loops=1)
-> Sort (cost=141877.96..144377.96
rows=1000001 width=27) (actual time=1444.669..1906.993 rows=1000001 loops=1)
Sort Key: c1, c2
Sort Method: external merge
Disk: 39424kB
-> Seq Scan on t
(cost=0.00..18294.01 rows=1000001 width=27) (actual time=0.011..177.815
rows=1000001 loops=1)
Planning Time: 0.214 ms
Execution Time: 7839.646 ms
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Lewis | 2020-07-06 19:51:36 | Re: Apply LIMIT when computation is logically irrelevant |
Previous Message | Christophe Courtois | 2020-07-06 16:41:37 | Pgbench : vacuum default behaviour history |