Quick Links

Re: Introduce Index Aggregate - new GROUP BY strategy

From:	Sergey Soloviev <sergey(dot)soloviev(at)tantorlabs(dot)ru>
To:	Andrei Lepikhov <lepihov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: Introduce Index Aggregate - new GROUP BY strategy
Date:	2026-02-04 17:33:44
Message-ID:	2b06b055-7f0d-42a7-ac0b-983ee92e239f@tantorlabs.ru
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi!

> 2. Consider splitting the hash_* → spill_* field renaming into a separate preparatory commit
> to reduce the complexity of reviewing the core logic changes.

Here patches rebased to master. I've managed to move renaming part into different patch.
Also, I improved the accuracy of the planner in determining the required memory - it counts
total required index nodes and calculates amount of memory for them (internal and leaf
separately).

Patches in attachments.

> 3. I notice AGG_INDEX requires both sortable AND hashable types. While I understand this
> is for the hash-based spill partitioning, is this limitation necessary? Could you use sort-based
> spilling (similar to tuplesort's external merge) instead? This would allow AGG_INDEX to work
> with sortable-only types (I can imagine a geometric type with B-tree operators but no hash functions).

I thought about this idea and came to the conclusion, that this should be additional behaviour -
when the type is not hashable. Because if we allow only sortable types, then we have to choose
what to do to support memory limits:

1. Dump all tuples not present in index to disk
2. On overflow compute partial aggregates and at the end perform final merge/combine

Also, at the 1 case I am not considering sorting tuples, because otherwise what we get is plain
Sort/Group pair. By using hash-partitioning we improve performance, because all same tuples
will belong to the same bucket.

In case 2 we imply restriction on aggregation function itself, because not every aggregate has
combine function.

In the end, I haven't come to a decision on which option is better, so I will leave it as it is for now.

---
Sergey Soloviev

TantorLabs: https://tantorlabs.com

Attachment	Content-Type	Size
v4-0001-add-in-memory-btree-tuple-index.patch	text/x-patch	23.6 KB
v4-0002-change-field-prefix-from-hash_-to-spill_-in-AggSt.patch	text/x-patch	24.8 KB
v4-0003-introduce-AGG_INDEX-grouping-strategy-node.patch	text/x-patch	76.3 KB
v4-0004-make-use-of-IndexAggregate-in-planner-and-explain.patch	text/x-patch	27.3 KB
v4-0005-add-support-for-Partial-IndexAggregate.patch	text/x-patch	14.3 KB
v4-0006-fix-tests-for-IndexAggregate.patch	text/x-patch	84.2 KB

In response to

Re: Introduce Index Aggregate - new GROUP BY strategy at 2026-01-04 13:09:27 from Sergey Soloviev

Browse pgsql-hackers by date

	From	Date	Subject
Previous Message	David G. Johnston	2026-02-04 17:02:17	Re: Docs: Use non-default throughout the documentation