| From: | Tender Wang <tndrwang(at)gmail(dot)com> |
|---|---|
| To: | ihalatci(at)gmail(dot)com |
| Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org, Matheus Alcantara <matheusssilv97(at)gmail(dot)com> |
| Subject: | Re: BUG #19517: Eager Aggregation produces wrong count(*) when pushed into RHS of Hash Semi Join |
| Date: | 2026-06-11 01:03:22 |
| Message-ID: | CAHewXNm8cGaSTXuZgmJLhXYHsatqhAuOpo5fnCSQQog3aRhFrA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-bugs |
Hi all
Matheus Alcantara <matheusssilv97(at)gmail(dot)com> 于2026年6月11日周四 04:19写道:
>
> On Wed Jun 10, 2026 at 12:44 PM -03, PG Bug reporting form wrote:
> > Likely cause
> > ------------
> > Eager Aggregation appears to consider the RHS of a SEMI/ANTI join a legal
> > site to push a partial aggregate, but for SEMI joins each outer row matches
> > at most one inner row by the join's own semantics, so partial counts taken
> > on the inner side cannot be combined into a correct outer-side count(*).
> > Either eager aggregation must be disabled for the RHS of SEMI/ANTI joins,
> > or the partial state must be projected through the semi-join such that
> > each surviving outer row contributes exactly one partial.
> >
> > Two follow-up commits to eager aggregation landed on 2026-04-06
> > ("Fix volatile function evaluation in eager aggregation" and
> > "Fix collation handling for grouping keys in eager aggregation"), but
> > neither addresses semi-join correctness.
> >
>
> Wondering if commit ffeda04259b (Fix eager aggregation for semi/antijoin
> inner rels) is about fixing this issue. Can you please check it? I think
> that 19beta1 was created before the fix was applied on master.
>
Yes, this commit ffeda04259b has fixed this problem. But it wasn't
included in the 19beta1.
I tried on HEAD, and the result was correct.
postgres=# SELECT u.user_id, count(*)
FROM u
WHERE u.value_2 = ANY (SELECT e.value_2
FROM e
WHERE u.user_id > e.user_id)
GROUP BY u.user_id;
user_id | count
---------+-------
5 | 26
(1 row)
postgres=# explain SELECT u.user_id, count(*)
FROM u
WHERE u.value_2 = ANY (SELECT e.value_2
FROM e
WHERE u.user_id > e.user_id)
GROUP BY u.user_id;
QUERY PLAN
---------------------------------------------------------------------
HashAggregate (cost=4.76..4.77 rows=1 width=12)
Group Key: u.user_id
-> Hash Semi Join (cost=3.25..4.71 rows=9 width=4)
Hash Cond: (u.value_2 = e.value_2)
Join Filter: (u.user_id > e.user_id)
-> Seq Scan on u (cost=0.00..1.26 rows=26 width=8)
-> Hash (cost=2.00..2.00 rows=100 width=8)
-> Seq Scan on e (cost=0.00..2.00 rows=100 width=8)
(8 rows)
--
Thanks,
Tender Wang
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Michael Paquier | 2026-06-11 01:48:30 | Re: [PATCH] contrib/xml2: backend crash in xpath_nodeset() on the namespace axis |
| Previous Message | Andrey Chernyy | 2026-06-11 00:14:36 | [PATCH] contrib/xml2: backend crash in xpath_nodeset() on the namespace axis |