Re: Add enable_groupagg GUC parameter to control GroupAggregate usage

From: Tatsuro Yamada <yamatattsu(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add enable_groupagg GUC parameter to control GroupAggregate usage
Date: 2025-06-11 03:07:44
Message-ID: CAOKkKFsLbbpVMCG2_bV8u7M2sXLTs0y_yfMpywKekGi+qKtWKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Ashutosh,

Thanks for your reply!

>I first thought enable_hashagg should be sufficient to choose one strategy
over the other. But that is not true, enable_hashagg = true allows both the
strategies, enable_hashagg = false disables just hash strategy. There's no
way to disable group agg alone. So I think it makes sense to have this GUC.

Yes, exactly! Thank you for the clarification.

>I am surprised that we didn't see this being a problem for so long.

I was also wondering why this GUC parameter hadn't been introduced
earlier.

>We seem to disable mixed strategy when enable_hashagg is false. Do we want
to do the same when enable_groupagg = false as well?

As I mentioned in my previous email, setting enable_groupagg = false
resulted in better execution time compared to the mixed strategy (which,
as I understand, includes both HashAgg and GroupAgg). So, I believe
this new GUC parameter would be helpful for users in certain situations.

Here’s how the current and proposed behaviors compare:

Current behavior:
enable_hashagg = ON → Mixed
enable_hashagg = OFF → GroupAgg

After applying the patch:
enable_hashagg = ON, enable_groupagg = ON → Mixed
enable_hashagg = OFF, enable_groupagg = ON → GroupAgg
enable_hashagg = ON, enable_groupagg = OFF → HashAgg (new)
enable_hashagg = OFF, enable_groupagg = OFF → GroupAgg

In addition, if the both parameters = OFF, GroupAgg will be selected in
the patch. The reason is that I found a comment that HashAgg might use
too much memory, so I decided to prioritize using GroupAgg, which is
more secure.

In the last case, I chose to default to GroupAgg since I found a
comment suggesting HashAgg might consume excessive memory, so GroupAgg
seemed the safer fallback.

Some hackers may want to compare the actual execution plans and times
under different GUC settings, so I've attached my test results in
"test_result.txt".

>Some of those instances are for plan stability, all of which need not be
replicated. But some of them explicitly test sort based grouping. For rest
of them hash based plan seems to be the best one, so explicit
enable_groupagg = false is not needed. We will need some test to test the
switch though.

Thanks for your advice. I'll create a regression test and send a new patch
to -hackers in my next email.

Regards,
Tatsuro Yamada

Attachment Content-Type Size
test_result.txt text/plain 40.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shinya Kato 2025-06-11 03:29:26 Re: Extend COPY FROM with HEADER <integer> to skip multiple lines
Previous Message Dilip Kumar 2025-06-11 03:04:24 Re: Proposal: Global Index for PostgreSQL