RE: Add enable_groupagg GUC parameter to control GroupAggregate usage

From: Tatsuro Yamada <tatsuro(dot)yamada(at)ntt(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>, Tatsuro Yamada <yamatattsu(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Add enable_groupagg GUC parameter to control GroupAggregate usage
Date: 2026-06-25 09:11:59
Message-ID: TY4P301MB129931A356F6934C432BB0259EEC2@TY4P301MB1299.JPNP301.PROD.OUTLOOK.COM
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Richard!

> > What do others involved in planner and plan-tuning think?
>
> I think this is a nice idea. It is true that we can sometimes use
> enable_sort to push the planner off a sorted grouping plan, but it is
> a much bigger hammer, as it discourages every Sort in the query, not
> just the one under the grouping, so it can move other parts of the
> plan around and doesn't really isolate the grouping choice. And it
> doesn't always work anyway. If the grouping input happens to be
> cheaply sorted there's no Sort to penalize, so the sorted plan
> survives. I ran into exactly that in union.sql, where enable_sort=off
> isn't really producing the hashed INTERSECT it looks like it is
> testing. Look at this plan from union.out:
>
> -- check hashed implementation
> set enable_hashagg = true;
> set enable_sort = false;
>
> explain (costs off)
> select from generate_series(1,5) intersect select from generate_series(1,3);
> QUERY PLAN
> ----------------------------------------------------------
> SetOp Intersect
> -> Function Scan on generate_series
> -> Function Scan on generate_series generate_series_1
> (3 rows)
>
> Also, without such a GUC, there are some paths that we have no way to
> generate. Look at this comment in union.sql:
>
> -- We've no way to check hashed UNION as the empty pathkeys in the
> Append are
> -- fine to make use of Unique, which is cheaper than HashAggregate and
> we've
> -- no means to disable Unique.

Thank you for your comments!
I'm glad that you appreciate the value of the patch.

Regarding the comments,
I hadn't realized that there were other plan nodes besides GroupAgg that
would also benefit from this GUC. I believe this feature would be useful to
many users. I also think it could be valuable for testing and tuning tools
such as pg_plan_advice, developed by Robert. For that reason, I'd like to
continue improving the patch with the goal of getting it committed for PG20.

> Regarding the patch, currently the new GUC only covers GroupAggregate.
> I think we should make it to cover all sort-based equivalents of
> enable_hashagg, such as the sorted mode of SetOp (as David suggested),
> sort-based Unique step used for DISTINCT and semijoin
> unique-ification, and maybe also Group nodes.

It sounds like there are several plan nodes that could be covered by this GUC
(such as SetOp, sort-based Unique for DISTINCT, semijoin uniqueification,
and Group nodes). Do you think we should cover all of them before the patch
would be considered complete enough for commit?

I plan to work through them one by one, but if others are interested in
contributing, I would certainly welcome the collaboration. That would help
move the patch forward more quickly and allow us to gather feedback on
different parts of the design in parallel.

As a practical matter, only part of my time can be devoted to community
development work. If I end up handling everything myself, progress may be
slower, and I'd like to avoid the patch losing momentum.
A related challenge is that extending the patch to cover more plan nodes will
likely increase the scope of the regression test changes. Perhaps it would
make sense to split the work into smaller patches, allowing each set of
regression test changes to be reviewed independently.

I'd be interested to hear your thoughts on how best to proceed, and whether
dividing the work into smaller pieces would make sense.

Regards,
Tatsuro Yamada

> - Richard

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2026-06-25 09:16:22 Re: Support EXCEPT for ALL SEQUENCES publications
Previous Message Richard Guo 2026-06-25 08:49:31 plpython: NULL pointer dereference on broken sequence objects