From: | Tatsuro Yamada <yamatattsu(at)gmail(dot)com> |
---|---|
To: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Add enable_groupagg GUC parameter to control GroupAggregate usage |
Date: | 2025-06-11 03:07:44 |
Message-ID: | CAOKkKFsLbbpVMCG2_bV8u7M2sXLTs0y_yfMpywKekGi+qKtWKw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Ashutosh,
Thanks for your reply!
>I first thought enable_hashagg should be sufficient to choose one strategy
over the other. But that is not true, enable_hashagg = true allows both the
strategies, enable_hashagg = false disables just hash strategy. There's no
way to disable group agg alone. So I think it makes sense to have this GUC.
Yes, exactly! Thank you for the clarification.
>I am surprised that we didn't see this being a problem for so long.
I was also wondering why this GUC parameter hadn't been introduced
earlier.
>We seem to disable mixed strategy when enable_hashagg is false. Do we want
to do the same when enable_groupagg = false as well?
As I mentioned in my previous email, setting enable_groupagg = false
resulted in better execution time compared to the mixed strategy (which,
as I understand, includes both HashAgg and GroupAgg). So, I believe
this new GUC parameter would be helpful for users in certain situations.
Here’s how the current and proposed behaviors compare:
Current behavior:
enable_hashagg = ON → Mixed
enable_hashagg = OFF → GroupAgg
After applying the patch:
enable_hashagg = ON, enable_groupagg = ON → Mixed
enable_hashagg = OFF, enable_groupagg = ON → GroupAgg
enable_hashagg = ON, enable_groupagg = OFF → HashAgg (new)
enable_hashagg = OFF, enable_groupagg = OFF → GroupAgg
In addition, if the both parameters = OFF, GroupAgg will be selected in
the patch. The reason is that I found a comment that HashAgg might use
too much memory, so I decided to prioritize using GroupAgg, which is
more secure.
In the last case, I chose to default to GroupAgg since I found a
comment suggesting HashAgg might consume excessive memory, so GroupAgg
seemed the safer fallback.
Some hackers may want to compare the actual execution plans and times
under different GUC settings, so I've attached my test results in
"test_result.txt".
>Some of those instances are for plan stability, all of which need not be
replicated. But some of them explicitly test sort based grouping. For rest
of them hash based plan seems to be the best one, so explicit
enable_groupagg = false is not needed. We will need some test to test the
switch though.
Thanks for your advice. I'll create a regression test and send a new patch
to -hackers in my next email.
Regards,
Tatsuro Yamada
Attachment | Content-Type | Size |
---|---|---|
test_result.txt | text/plain | 40.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Shinya Kato | 2025-06-11 03:29:26 | Re: Extend COPY FROM with HEADER <integer> to skip multiple lines |
Previous Message | Dilip Kumar | 2025-06-11 03:04:24 | Re: Proposal: Global Index for PostgreSQL |