Re: Partial aggregates pushdown

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)mitsubishielectric(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Subject: Re: Partial aggregates pushdown
Date: 2023-11-20 22:48:42
Message-ID: ZVviSttXc7kff_I5@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 20, 2023 at 03:51:33PM -0500, Robert Haas wrote:
> On Mon, Nov 13, 2023 at 3:26 AM Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp
> <Fujii(dot)Yuki(at)df(dot)mitsubishielectric(dot)co(dot)jp> wrote:
> > In postgres_fdw.sql, I have corrected the output format for floating point numbers
> > by extra_float_digits.
>
> Looking at this, I find that it's not at all clear to me how the
> partial aggregate function is defined. Let's look at what we have for
> documentation:
>
> + <para>
> + Paraemter <literal>AGGPARTIALFUNC</literal> optionally defines a
> + partial aggregate function used for partial aggregate pushdown; see
> + <xref linkend="xaggr-partial-aggregates"/> for details.
> + </para>
>
> + Partial aggregate function (zero if none).
> + See <xref linkend="partial-aggregate-pushdown"/> for the definition
> + of partial aggregate function.
>
> + Partial aggregate pushdown is an optimization for queries that contains
> + aggregate expressions for a partitioned table across one or more remote
> + servers. If multiple conditions are met, partial aggregate function
>
> + When partial aggregate pushdown is used for aggregate expressions,
> + remote queries replace aggregate function calls with partial
> + aggregate function calls. If the data type of the state value is not
>
> But there's no definition of what the behavior of the function is
> anywhere that I can see, not even in <sect2
> id="partial-aggregate-pushdown">. Everywhere it only describes how the
> partial aggregate function is used, not what it is supposed to do.

Yes, I had to figure that out myself, and I was wondering how much
detail to have in our docs vs README files vs. C comments. I think we
should put more details somewhere.

> Looking at the changes in pg_aggregate.dat, it seems like the partial
> aggregate function is a second aggregate defined in a way that mostly
> matches the original, except that (1) if the original final function
> would have returned a data type other than internal, then the final
> function is removed; and (2) if the original final function would have
> returned a value of internal type, then the final function is the
> serialization function of the original aggregate. I think that's a
> reasonable definition, but the documentation and code comments need to
> be a lot clearer.

Agreed. I wasn't sure enough about this to add it when I was reviewing
the patch.

> I do have a concern about this, though. It adds a lot of bloat. It
> adds a whole lot of additional entries to pg_aggregate, and every new
> aggregate we add in the future will require a bonus entry for this,
> and it needs a bunch of new pg_proc entries as well. One idea that
> I've had in the past is to instead introduce syntax that just does
> this, without requiring a separate aggregate definition in each case.
> For example, maybe instead of changing string_agg(whatever) to
> string_agg_p_text_text(whatever), you can say PARTIAL_AGGREGATE
> string_agg(whatever) or string_agg(PARTIAL_AGGREGATE whatever) or
> something. Then all aggregates could be treated in a generic way. I'm
> not completely sure that's better, but I think it's worth considering.

So use an SQL keyword to indicates a pushdown call? We could then
automate the behavior rather than requiring special catalog functions?

> I think that the control mechanism needs some thought. Right now,
> there are two possible behaviors: either we assume that the local and
> remote sides are the same unconditionally, or we assume that they're
> the same if the remote side is a new enough version. I do like having
> those behaviors available, but I wonder if we need to do something
> better or different. What if somebody wants to push down a
> non-built-in aggregate, for example? I realize that we don't have

It does allow specification of extensions that can be pushed down.

> great solutions to the problem of knowing which functions are
> push-downable in general, and I don't know that partial aggregation
> needs to be any better than anything else, but it's probably worth
> comparing and contrasting the approach we take here with the
> approaches we've taken in other, similar cases. From that point of
> view, I think check_partial_aggregate_support is a novelty: we don't
> do those kinds of checks in other cases, AFAIK. But on the other hand,
> there is the 'extensions' argument to postgres_fdw.

Right. I am not sure how to improve what the patch does.

> I don't think the patch does a good job explaining why HAVING,
> DISTINCT, and ORDER BY are a problem. It seems to me that HAVING
> shouldn't really be a problem, because HAVING is basically a WHERE
> clause that occurs after aggregation is complete, and whether or not
> the aggregation is safe shouldn't depend on what we're going to do
> with the value afterward. The HAVING clause can't necessarily be
> pushed to the remote side, but I don't see how or why it could make
> the aggregate itself unsafe to push down. DISTINCT and ORDER BY are a
> little trickier: if we pushed down DISTINCT, we'd still have to
> re-DISTINCT-ify when combining locally, and if we pushed down ORDER
> BY, we'd have to do a merge pass to combine the returned values unless
> we could prove that the partitions were non-overlapping ranges that
> would be visited in the correct order. Although that all sounds
> doable, I think it's probably a good thing that the current patch
> doesn't try to handle it -- this is complicated already. But it should
> explain why it's not handling it and maybe even a bit about how it
> could be handling in the future, rather than just saying "well, this
> kind of thing is not safe." The trouble with that explanation is that
> it does nothing to help the reader understand whether the thing in
> question is *fundamentally* unsafe or whether we just don't have the
> right code to make it work.

Makes sense.

> Typo: Paraemter
>
> I'm so sorry to keep complaining about comments, but I think the
> comments in src/backend/optimizer are very far from being adequate.
> They are strictly formulaic and don't really explain anything. For
> example, I see that the patch adds a partial_target to
> GroupPathExtraData, but how do I understand the reason why we now need
> a second pathtarget beside the one that already exists? Certainly not
> from the comments in setGroupClausePartial, because there basically
> aren't any. True, there's a header comment, but it just says we
> generate this thing, not WHY we generate this thing. There's nothing
> meaningful to be found in src/include/nodes/pathnodes.h about why
> we're doing this, either.
>
> And this problem really extends throughout the patch: comments are
> mostly short and just describe what the code does, not WHY it does
> that. And the WHY is really the important part. Otherwise we will not
> be able to maintain this code going forward.

Understood. I wish I knew enough to add them myself. I can help if
someone can supply the details.

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2023-11-20 22:52:22 Re: Hide exposed impl detail of wchar.c
Previous Message Jeff Davis 2023-11-20 22:48:27 Re: PANIC serves too many masters