Re: Add proper planner support for ORDER BY / DISTINCT aggregates

From: Pavel Luzanov <p(dot)luzanov(at)postgrespro(dot)ru>
To: David Rowley <dgrowleyml(at)gmail(dot)com>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Richard Guo <guofenglinux(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Subject: Re: Add proper planner support for ORDER BY / DISTINCT aggregates
Date: 2022-11-08 06:39:12
Message-ID: 0375ee11-0a96-2299-99b6-1154a65d79c4@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 08.11.2022 04:31, David Rowley wrote:
> I've been playing around with the attached patch which does:
>
> 1. Adjusts add_paths_to_grouping_rel so that we don't add a Sort path
> when we can add an Incremental Sort path instead. This removes quite a
> few redundant lines of code.
> 2. Removes the * 1.5 fuzz-factor in cost_incremental_sort()
> 3. Does various other code tidy stuff in cost_incremental_sort().
> 4. Removes the test from incremental_sort.sql that was ensuring the
> inferior Sort -> Sort plan was being used instead of the superior Sort
> -> Incremental Sort plan.

I can confirm that with this patch, the plan with incremental sorting
beats the others.

Here are the test results with my previous example.

Script:

create table t (a text, b text, c text);
insert into t (a,b,c) select x,y,x from generate_series(1,100) as x,
generate_series(1,10000) y;
create index on t (a);
vacuum analyze t;
reset all;

explain (settings, analyze)
select a, array_agg(c order by c) from t group by a;

\echo set enable_incremental_sort=off;
set enable_incremental_sort=off;

explain (settings, analyze)
select a, array_agg(c order by c) from t group by a;

\echo set enable_seqscan=off;
set enable_seqscan=off;

explain (settings, analyze)
select a, array_agg(c order by c) from t group by a;

Script output:

CREATE TABLE
INSERT 0 1000000
CREATE INDEX
VACUUM
RESET
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=957.60..113221.24 rows=100 width=34) (actual
time=6.088..381.777 rows=100 loops=1)
   Group Key: a
   ->  Incremental Sort  (cost=957.60..108219.99 rows=1000000 width=4)
(actual time=2.387..272.332 rows=1000000 loops=1)
         Sort Key: a, c
         Presorted Key: a
         Full-sort Groups: 100  Sort Method: quicksort  Average Memory:
27kB  Peak Memory: 27kB
         Pre-sorted Groups: 100  Sort Method: quicksort  Average
Memory: 697kB  Peak Memory: 697kB
         ->  Index Scan using t_a_idx on t (cost=0.42..29279.42
rows=1000000 width=4) (actual time=0.024..128.083 rows=1000000 loops=1)
 Planning Time: 0.070 ms
 Execution Time: 381.815 ms
(10 rows)

set enable_incremental_sort=off;
SET
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=128728.34..136229.59 rows=100 width=34) (actual
time=234.044..495.537 rows=100 loops=1)
   Group Key: a
   ->  Sort  (cost=128728.34..131228.34 rows=1000000 width=4) (actual
time=231.172..383.393 rows=1000000 loops=1)
         Sort Key: a, c
         Sort Method: external merge  Disk: 15600kB
         ->  Seq Scan on t  (cost=0.00..15396.00 rows=1000000 width=4)
(actual time=0.005..78.189 rows=1000000 loops=1)
 Settings: enable_incremental_sort = 'off'
 Planning Time: 0.041 ms
 Execution Time: 497.230 ms
(9 rows)

set enable_seqscan=off;
SET
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=142611.77..150113.02 rows=100 width=34) (actual
time=262.250..527.260 rows=100 loops=1)
   Group Key: a
   ->  Sort  (cost=142611.77..145111.77 rows=1000000 width=4) (actual
time=259.551..417.154 rows=1000000 loops=1)
         Sort Key: a, c
         Sort Method: external merge  Disk: 15560kB
         ->  Index Scan using t_a_idx on t (cost=0.42..29279.42
rows=1000000 width=4) (actual time=0.012..121.995 rows=1000000 loops=1)
 Settings: enable_incremental_sort = 'off', enable_seqscan = 'off'
 Planning Time: 0.041 ms
 Execution Time: 528.950 ms
(9 rows)

--
Pavel Luzanov
Postgres Professional: https://postgrespro.com
The Russian Postgres Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2022-11-08 06:51:00 Re: Add proper planner support for ORDER BY / DISTINCT aggregates
Previous Message Ian Lawrence Barwick 2022-11-08 06:38:22 Re: [patch] Have psql's \d+ indicate foreign partitions