Re: Introduce Index Aggregate - new GROUP BY strategy

From: Sergey Soloviev <sergey(dot)soloviev(at)tantorlabs(dot)ru>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Introduce Index Aggregate - new GROUP BY strategy
Date: 2026-01-04 13:09:27
Message-ID: f41ddd0f-25ba-4b02-af6b-23a44f4164d8@tantorlabs.ru
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hi!

Sorry for late answer, I didn't notice your email.

> Here is my 'aerial' review
Yes. You are right.

> Can you benchmark the scenario where the optimiser mispredicts numGroups. If the planner underestimates group cardinality, the btree overhead could be much higher than expected. Does the approach degrade gracefully?
 I will try
> 2. Consider splitting the hash_* → spill_* field renaming into a separate preparatory commit to reduce the complexity of reviewing the core logic changes.
Will be done
> 3. I notice AGG_INDEX requires both sortable AND hashable types. While I understand this is for the hash-based spill partitioning, is this limitation necessary? Could you use sort-based spilling (similar to tuplesort's external merge) instead? This would allow AGG_INDEX to work with sortable-only types (I can imagine a geometric type with B-tree operators but no hash functions).
I think this is possible if we could use combine function. I did not think about this yet.

---

Some days ago I have implemented Ttree as internal index instead of B+tree. To my surprise, the performance degraded. There is a table with benchmark results (amount is amount of groups and value is latency in ms).

int

| amount | HashAgg | GroupAgg | IndexAgg |
| ------ | ------- | -------- | -------- |
| 100    | 0.222   | 0.199    | 0.198    |
| 1000   | 1.506   | 1.506    | 1.414    |
| 10000  | 15.414  | 15.598   | 15.891   |
| 100000 | 159.625 | 171.507  | 194.401  |

bigint

| amount | HashAgg | GroupAgg | IndexAgg |
| ------ | ------- | -------- | -------- |
| 100    | 0.220   | 0.198    | 0.196    |
| 1000   | 1.504   | 1.514    | 1.419    |
| 10000  | 15.404  | 15.717   | 15.836   |
| 100000 | 160.323 | 172.922  | 193.799  |

text

| amount | HashAgg | GroupAgg | IndexAgg |
| ------ | ------- | -------- | -------- |
| 100    | 0.280   | 0.301    | 0.287    |
| 1000   | 2.267   | 2.954    | 2.734    |
| 10000  | 24.613  | 35.383   | 35.401   |
| 100000 | 270.657 | 430.929  | 485.113  |

uuid

| amount | HashAgg | GroupAgg | IndexAgg |
| ------ | ------- | -------- | -------- |
| 100    | 0.311   | 0.317    | 0.310    |
| 1000   | 2.827   | 2.667    | 2.675    |
| 10000  | 33.233  | 26.980   | 28.848   |
| 100000 | 437.452 | 287.236  | 363.142  |

You can notice how latency increases when amount of groups reaches 100K. Probably this is because of low branching of the Ttree - unlike B+tree it has only 2 children, so have to traverse more nodes.
Also, I do not deny that the problem may be in my code, i.e. some paths are not optimized or there is a bug and tree becomes imbalanced.
I will try to implement simple Btree as another attempt (not B+tree).

The patches are in attachments.

---
Sergey Soloviev

TantorLabs: https://tantorlabs.com

Attachment Content-Type Size
v4-0005-fix-tests-for-IndexAggregate.patch text/x-patch 81.3 KB
v4-0001-add-in-memory-T-tree-tuple-index.patch text/x-patch 36.7 KB
v4-0002-introduce-AGG_INDEX-grouping-strategy-node.patch text/x-patch 81.1 KB
v4-0003-make-use-of-IndexAggregate-in-planner-and-explain.patch text/x-patch 20.9 KB
v4-0004-add-support-for-Partial-IndexAggregate.patch text/x-patch 14.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuya Kawata 2026-01-04 13:37:02 Re: [Patch]Add tab completion for DELETE ... USING
Previous Message Tatsuo Ishii 2026-01-04 12:28:25 Re: Row pattern recognition