| From: | Richard Guo <guofenglinux(at)gmail(dot)com> |
|---|---|
| To: | Tender Wang <tndrwang(at)gmail(dot)com> |
| Cc: | Radim Marek <radim(at)boringsql(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Eager aggregation, take 3 |
| Date: | 2026-06-01 07:19:03 |
| Message-ID: | CAMbWs49zScoONs0151cmgoo5cttkkK2cGLx228+Wok-h8XoyOg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Sun, May 31, 2026 at 8:28 PM Tender Wang <tndrwang(at)gmail(dot)com> wrote:
> Radim Marek <radim(at)boringsql(dot)com> 于2026年5月29日周五 23:55写道:
> > == How to reproduce
> >
> > CREATE TEMP TABLE c(id int, country text);
> > CREATE TEMP TABLE o(customer_id int);
> > INSERT INTO c VALUES (1,'US'),(2,'US'),(3,'DE'),(4,'DE'),(5,'DE');
> > INSERT INTO o VALUES (1),(3); -- only customers 1 and 3 have a row in o
> >
> > SELECT c.country, count(*) AS n
> > FROM c
> > WHERE NOT EXISTS (SELECT 1 FROM o WHERE o.customer_id = c.id)
> > GROUP BY c.country
> > ORDER BY c.country;
> >
> > Expected results (everywhere except master)
> >
> > country | n
> > ---------+---
> > DE | 2
> > US | 1
> > (2 rows)
> >
> > The actual result with enable_eager_aggregate = on (default)
> >
> > country | n
> > ---------+---
> > DE | 0
> > US | 0
> > (2 rows)
Thanks for the report. This is a bug. We should never push a partial
aggregation down to a relation on the inner (RHS) side of a semi/anti
join. A semi/anti join does not preserve its inner rows in the join
output, so a partial aggregate computed on the inner side would not
survive the join and could not be combined by the final aggregation.
> I haven't thought about it too deeply yet. Maybe we can do something
> in the make_grouped_join_rel().
> ...
> if (sjinfo->jointype == JOIN_ANTI || sjinfo->jointype == JOIN_SEMI)
> return;
> ...
That does fix the reported case, but I think it's too broad: it also
disables pushing a partial aggregate to the outer side of a semi/anti
join, which is valid. And by the time we reach make_grouped_join_rel
the grouped relation for the inner-side relation has already been
built, so it would just go unused.
So I'd rather fix it in eager_aggregation_possible_for_relation, right
next to the existing outer-join check, by rejecting a relation that
lies on the inner side of a semijoin/antijoin. See attached.
- Richard
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Fix-eager-aggregation-for-semi-antijoin-inner-rel.patch | application/octet-stream | 8.9 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Peter Smith | 2026-06-01 07:24:07 | Re: Support EXCEPT for TABLES IN SCHEMA publications |
| Previous Message | Chengpeng Yan | 2026-06-01 06:32:20 | Re: Extended statistics improvement: multi-column MCV missing values |