Quick Links

Re: Introduce Index Aggregate - new GROUP BY strategy

From:	Sergey Soloviev <sergey(dot)soloviev(at)tantorlabs(dot)ru>
To:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>
Subject:	Re: Introduce Index Aggregate - new GROUP BY strategy
Date:	2025-12-09 15:26:32
Message-ID:	fb5b0be3-7d14-4365-b02c-e3314b66b8e7@tantorlabs.ru
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi!

> Are you able to provide benchmarks
Yes, sure.

Test matrix:

- number of groups: from 100 to 1000000 increased by 10 times
- different types: int, bigint, uuid, text
- strategy: hash, group, index

For each key value there are 3 tuples with different 'j' value (for
aggregation logic).

Also, there is a test (called bigtext) for large string as a key (each string is 4kB).

To test pgbench is used. Test query looks like this:

select i, sum(j) from TBL group by 1 order by 1;

Depending on the table size duration is set from 1 to 3 minutes.
Everything in attached scripts:

- setup.sql - script to setup environment (create tables, setup GUCs).
after running this you should restart database.
NOTE: actually, for int and bigint number of groups is less
than power of 10
- run_bench.sh - shell script that runs test workload. After running
it will create files with pgbench results.
- collect_results.sh - parses output files and formats result table.
As values it shows TPS.
- show_plan.sh - small script to run EXPLAIN for each run query

Finally, I have this table:

int

| amount | HashAgg | GroupAgg | IndexAgg |
| ------------- | ------------------ | ------------------- | ------------------ |
| 100 | 3249.929602 | 3501.174072 | 3765.727121 |
| 1000 | 504.420643 | 501.465754 | 575.255906 |
| 10000 | 50.528155 | 49.312322 | 54.510261 |
| 100000 | 4.775069 | 4.317584 | 4.791735 |
| 1000000 | 0.405538 | 0.406698 | 0.321379 |

bigint

| amount | HashAgg | GroupAgg | IndexAgg |
| ------------ | -------------------| ------------------- | ------------------ |
| 100 | 3225.287886 | 3510.612641 | 3742.911726 |
| 1000 | 492.908092 | 491.530184 | 574.475159 |
| 10000 | 50.192018 | 49.555983 | 53.909437 |
| 100000 | 4.831086 | 4.430059 | 4.748821 |
| 1000000 | 0.401983 | 0.413218 | 0.318144 |

text

| amount | HashAgg | GroupAgg | IndexAgg |
| ------------ | -------------------| ------------------- | ------------------ |
| 100 | 2647.030876 | 2553.503954 | 2946.282525 |
| 1000 | 348.464373 | 286.818555 | 342.771923 |
| 10000 | 32.891834 | 24.386304 | 28.249571 |
| 100000 | 2.934513 | 1.956983 | 2.237997 |
| 1000000 | 0.249291 | 0.148780 | 0.150943 |

uuid

| amount | HashAgg | GroupAgg | IndexAgg |
| ------------ | ------------------ | ------------------- | ------------------ |
| 100 | N/A | 2282.812585 | 2432.713816 |
| 1000 | N/A | 282.637163 | 303.892131 |
| 10000 | N/A | 28.375838 | 28.924711 |
| 100000 | N/A | 2.649958 | 2.449907 |
| 1000000 | N/A | 0.255203 | 0.194414 |

bigtext

| HashAgg | GroupAgg | IndexAgg |
| -------------- | --------------- | -------------- |
| N/A | 0.035247 | 0.041120 |

NOTES: I could not make Hash + Sort plan for uuid and bigtext
test and it reproduces even on upstream without this patch.

The main observation is that on small amount of groups
Index Aggregate performs better than other strategies:

- int and bigint even up to 100K keys
- text only for 100 keys
- uuid up to 10K keys
- bigtext better than Group + Sort, but tested only on big amount
of keys (100K)

---
Sergey Soloviev

TantorLabs: https://tantorlabs.com

Attachment	Content-Type	Size
collect_results.sh	application/x-shellscript	851 bytes
run_bench.sh	application/x-shellscript	2.7 KB
setup.sql	application/sql	1.2 KB
show_plan.sh	application/x-shellscript	2.1 KB

In response to

Re: Introduce Index Aggregate - new GROUP BY strategy at 2025-12-08 23:11:46 from David Rowley

Responses

Re: Introduce Index Aggregate - new GROUP BY strategy at 2025-12-09 15:31:07 from Сергей Соловьев

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Сергей Соловьев	2025-12-09 15:31:07	Re: Introduce Index Aggregate - new GROUP BY strategy
Previous Message	Heikki Linnakangas	2025-12-09 15:23:11	Re: Add wait event for CommitDelay