Re: Show estimated number of groups for IncrementalSort in EXPLAIN

From: solai v <solai(dot)cdac(at)gmail(dot)com>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Show estimated number of groups for IncrementalSort in EXPLAIN
Date: 2026-06-23 12:06:31
Message-ID: CAF0whuc_+9ovdUhp0C83ZBoA93o1AFbvWC7khN==tPVSdXvYCg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi all,

On Tue, Jun 23, 2026 at 3:22 PM Ilia Evdokimov
<ilya(dot)evdokimov(at)tantorlabs(dot)com> wrote:
>
> Hi hackers,
>
> In [0], a question was raised whether we should expose IncrementalSort group estimation in EXPLAIN, as we did for Memoize. Knowing the estimated number of groups helps understand why the planner chose IncrementalSort whether a bad estimate is to blame for a sub-optimal plan.
>
> Example of EXPLAIN:
>
> ```
> CREATE TABLE t (a int, b int, c int);
> CREATE INDEX ON t (a);
> INSERT INTO t SELECT i % 100, (random() * 1000)::int, (random() * 1000)::int FROM generate_series(1, 100000) i;
> ANALYZE t;
> SET enable_seqscan = off;
>
> EXPLAIN ANALYZE SELECT * FROM t ORDER BY a, b;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------
> Incremental Sort (cost=90.80..10302.90 rows=100000 width=12) (actual time=6.715..29.592 rows=100000.00 loops=1)
> Sort Key: a, b
> Presorted Key: a
> Estimated Groups: 100
> Full-sort Groups: 100 Sort Method: quicksort Average Memory: 27kB Peak Memory: 27kB
> Pre-sorted Groups: 100 Sort Method: quicksort Average Memory: 56kB Peak Memory: 56kB
> Buffers: shared hit=54201 dirtied=1 written=1
> -> Index Scan using t_a_idx on t (cost=0.29..4068.01 rows=100000 width=12) (actual time=0.346..20.403 rows=100000.00 loops=1)
> Index Searches: 1
> Buffers: shared hit=54201 dirtied=1 written=1
> Planning:
> Buffers: shared hit=21
> Planning Time: 0.411 ms
> Execution Time: 30.530 ms
> (14 rows)
> ```
>
> Thoughts?
>
>
> [0]: https://www.postgresql.org/message-id/6642af90-561c-4f0c-9d5b-7e288e6e7f84%40gmail.com
>

Thank you for the patch and the idea of exposing the estimated number
of groups for Incremental Sort in EXPLAIN makes sense to me. I applied
the patch on current master and verified the new output using a simple
Incremental Sort plan:

Incremental Sort (cost=86.88..9911.06 rows=100000 width=8)
Sort Key: a, b
Presorted Key: a
Estimated Groups: 100

I also verified that the field is displayed in both EXPLAIN and
EXPLAIN ANALYZE output, the field is omitted when EXPLAIN is executed
with COSTS OFF, displayed value matches the planner's estimate used
for costing, the output formatting looks correct and the new line
appears in an appropriate location between "Presorted Key" and the
execution statistics shown by EXPLAIN ANALYZE. Additionally, I tested
a parallel plan and confirmed that the estimate is displayed correctly
when Incremental Sort appears below a Gather Merge node:

Gather Merge (cost=4885.06..6358.53 rows=100000 width=8)
Workers Planned: 4
-> Sort (cost=4885.00..4947.50 rows=25000 width=8)
Sort Key: pt.a, pt.b
-> Parallel Append (cost=0.29..3058.80 rows=25000 width=8)
-> Parallel Index Scan using pt_p1_a_idx on pt_p1 pt_1
(cost=0.29..1467.02 rows=12500 width=8)
-> Parallel Index Scan using pt_p2_a_idx on pt_p2 pt_2
(cost=0.29..1466.78 rows=12500 width=8)

I also tested a partitioned table setup and verified that the estimate
is displayed correctly for Incremental Sort plans built on top of
partitioned relations:

Incremental Sort (cost=92.24..10418.69 rows=100000 width=8)
Sort Key: pt.a, pt.b
Presorted Key: pt.a
Estimated Groups: 100
-> Append (cost=0.58..4183.80 rows=100000 width=8)
-> Index Scan using pt_p1_a_idx on pt_p1 pt_1 (cost=0.29..1842.02
rows=50000 width=8)
-> Index Scan using pt_p2_a_idx on pt_p2 pt_2
(cost=0.29..1841.78 rows=50000 width=8)
(7 rows)

During testing I observed that the estimated number of groups differed
significantly which demonstrated the usefulness of exposing this
planner estimate for diagnosing cardinality estimation issues
affecting Incremental Sort costing. So far no issue with the patch and
the patch looks good to me. Looking forward to more feedback.

Regards,
Solai

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2026-06-23 12:10:52 Re: use of SPI by postgresImportForeignStatistics
Previous Message Jeevan Chalke 2026-06-23 11:56:40 Re: Add PRODUCT() aggregate function