Re: Eager aggregation, take 3

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tender Wang <tndrwang(at)gmail(dot)com>, Paul George <p(dot)a(dot)george19(at)gmail(dot)com>, Andy Fan <zhihuifan1213(at)163(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>
Subject: Re: Eager aggregation, take 3
Date: 2025-09-09 09:20:21
Message-ID: CAMbWs4_2BzuAX+BSO1p7rtUwmQjORrG-b906Cw-RkfRjFP0oSQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 5, 2025 at 10:12 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Aug 6, 2025 at 3:52 AM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> > What we really want to exclude are aggregate functions that can
> > produce large transition values by accumulating or concatenating input
> > rows. So I'm wondering if we could instead check the transfn_oid
> > directly and explicitly exclude only F_ARRAY_AGG_TRANSFN and
> > F_STRING_AGG_TRANSFN. We don't need to worry about json_agg,
> > jsonb_agg, or xmlagg, since they don't support partial aggregation
> > anyway.

> This strategy seems fairly unfriendly towards out-of-core code. Can
> you come up with something that allows the author of a SQL-callable
> function to include or exclude the function by a choice that is under
> their control, rather than hard-coding something in PostgreSQL itself?

Yeah, ideally we should be able to tell whether an aggregate's
transition state may grow unbounded just by looking at the system
catalogs. Unfortunately, after trying for a while, it seems to me that
the current catalogs don't provide enough information.

I once considered adding a flag (e.g., aggtransbounded) to the
pg_aggregate catalog to indicate whether the transition state size is
bounded.
This flag could be specified by users when creating aggregate
functions, and then leveraged by features such as eager aggregation.
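
To make the contrast between the two approaches concrete, here is a
rough, standalone sketch. It is not code from the patch. The struct,
the DEMO_* OID values, and both helper functions are hypothetical
stand-ins (the real transition function OIDs come from fmgroids.h),
and aggtransbounded is only the flag name floated above, not an
existing pg_aggregate column.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;

/* Placeholder OIDs for illustration only; not the real fmgroids.h values. */
#define DEMO_ARRAY_AGG_TRANSFN   ((Oid) 1001)
#define DEMO_STRING_AGG_TRANSFN  ((Oid) 1002)

/* Simplified, hypothetical stand-in for a pg_aggregate row. */
typedef struct DemoAggregate
{
    Oid  aggtransfn;        /* transition function OID */
    bool aggtransbounded;   /* hypothetical flag: state size is bounded */
} DemoAggregate;

/*
 * Approach 1: hard-code the transition functions to exclude.
 * Any aggregate not on the list, including an out-of-core one whose
 * state grows without bound, is accepted.
 */
static bool
eager_agg_ok_by_transfn(const DemoAggregate *agg)
{
    switch (agg->aggtransfn)
    {
        case DEMO_ARRAY_AGG_TRANSFN:
        case DEMO_STRING_AGG_TRANSFN:
            return false;
        default:
            return true;
    }
}

/*
 * Approach 2: trust a per-aggregate flag set by whoever created the
 * aggregate, so the choice is under the author's control.
 */
static bool
eager_agg_ok_by_flag(const DemoAggregate *agg)
{
    return agg->aggtransbounded;
}
```

With the hard-coded list, an extension aggregate that accumulates its
input would still pass the check; with the flag, its author could mark
it as unbounded and have it excluded.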

However, adding new information to system catalogs involves a lot of
discussion and changes, including updates to DDL commands, dump and
restore processes, and upgrade procedures. Therefore, to keep the
focus of this patch on the eager aggregation feature itself, I prefer
to treat this enhancement as future work.

- Richard
