Quick Links

Remove redundant DISTINCT when GROUP BY already guarantees uniqueness

From:	Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To:	PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Remove redundant DISTINCT when GROUP BY already guarantees uniqueness
Date:	2026-06-08 11:12:20
Message-ID:	faf8f63f-b476-4ca8-89f9-7cfc104b522c@tantorlabs.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi hackers,

When a query contains both DISTINCT and GROUP BY, planner currently
always creates a separate plan node to eliminate duplicate output rows.
However, if GROUP BY is present, each output row already corresponds to
exactly one group - meaning the rows produced by GROUP BY are already
unique on their group keys. Any DISTINCT clause that covers all those
group keys therefore adds no filtering and the de-duplication step is
pure overhead.

This patch teaches the planner to detect this situation and skip
creating the DISTINCT paths entirely. The optimization applies when
three conditions hold:
- the query uses a plan GROUP BY (not GROUPING SETS, which can produce
extra NULL-filled rows across sets),
- the clause is plain DISTINCT (not DISTINCT ON, which has different
semantics),
- and every GROUP BY key appears in the DISTINCT clause.

The last condition is necessary because if a GROUP BY key is absent from
the DISTINCT list, two different groups could still produce identical
output tuple (e.g. `SELECT DISTINCT 1 FROM t GROUP BY a`)

Example:

```
CREATE TABLE t (a INT, b INT, c INT);
INSERT INTO t VALUES (1,1,10),(1,2,20),(2,1,30),(2,2,40),(1,1,50);
ANALYZE t;

EXPLAIN (COSTS OFF) SELECT DISTINCT a, b, sum(c) FROM t GROUP BY a, b;
```

Before patch:
```
QUERY PLAN
----------------------------------
Unique
-> Sort
Sort Key: a, b, (sum(c))
-> HashAggregate
Group Key: a, b
-> Seq Scan on t
(6 rows)
```

After patch
```
QUERY PLAN
---------------------
HashAggregate
Group Key: a, b
-> Seq Scan on t
(3 rows)
```

Any feedback or suggestions for additional cases are welcome.

--
Best regards.
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/

Attachment	Content-Type	Size
v1-0001-Remove-redundant-DISTINCT-when-GROUP-BY-guarantee.patch	text/x-patch	9.8 KB

Browse pgsql-hackers by date

	From	Date	Subject
Previous Message	Tomas Vondra	2026-06-08 11:04:47	Re: Subquery pull-up increases jointree search space