Re: Missing Group Key in grouped aggregate

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Erik Nordström <erik(at)timescale(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Missing Group Key in grouped aggregate
Date: 2024-02-20 18:56:09
Message-ID: e7f4da44-826e-47d3-9edd-ad31ff831a90@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2/20/24 16:53, Tom Lane wrote:
> =?UTF-8?Q?Erik_Nordstr=C3=B6m?= <erik(at)timescale(dot)com> writes:
>> I noticed that, beginning with PG16, grouped aggregates are missing the
>> "Group Key" in the EXPLAIN output.
>
>> It seems the Agg node has numCols (number of grouping cols) set to zero in
>> queries like
>
>> SELECT foo, count(*) FROM bar WHERE foo=1 GROUP BY foo;
>
>> In PG15, the "Group Key" is shown and the Agg node has numCols set as
>> expected.
>
> Looks sane to me: the planner now notices that there can only
> be one group so it doesn't tell the GroupAgg node to worry about
> making groups. If it were missing in a case where there could be
> multiple output groups, yes that'd be a bug.
>
> If you want to run it to ground you could bisect to see where the
> behavior changed, but you'd probably just find it was intentional.
>

I believe this changed in:

commit 8d83a5d0a2673174dc478e707de1f502935391a5
Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Date: Wed Jan 18 12:37:57 2023 -0500

Remove redundant grouping and DISTINCT columns.

Avoid explicitly grouping by columns that we know are redundant
for sorting, for example we need group by only one of x and y in
SELECT ... WHERE x = y GROUP BY x, y
This comes up more often than you might think, as shown by the
changes in the regression tests. It's nearly free to detect too,
since we are just piggybacking on the existing logic that detects
redundant pathkeys. (In some of the existing plans that change,
it's visible that a sort step preceding the grouping step already
didn't bother to sort by the redundant column, making the old plan
a bit silly-looking.)

...

It's not quite obvious from the commit message, but that's where git
bisect says the behavior changed.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilshod Urazov 2024-02-20 19:19:21 Re: Proposal: Adjacent B-Tree index
Previous Message Andy Fan 2024-02-20 18:38:08 Re: Shared detoast Datum proposal