Re: Partial aggregates pushdown

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)mitsubishielectric(dot)co(dot)jp>, Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: Partial aggregates pushdown
Date: 2024-03-19 21:29:07
Message-ID: 4020145.1710883747@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> The current patch has:

> if ((OidIsValid(aggform->aggfinalfn) ||
> (aggform->aggtranstype == INTERNALOID)) &&
> fpinfo->check_partial_aggregate_support)
> {
> if (fpinfo->remoteversion == 0)
> {
> PGconn *conn = GetConnection(fpinfo->user, false, NULL);

> fpinfo->remoteversion = PQserverVersion(conn);
> }

> if (fpinfo->remoteversion < PG_VERSION_NUM)
> partial_agg_ok = false;
> }

> It uses check_partial_aggregate_support, which defaults to false,
> meaning partial aggregates will be pushed down with no version check by
> default. If set to true, pushdown will happen if the remote server is
> the same version or newer, which seems acceptable to me.

I'd like to vociferously protest both of those decisions.

"No version check by default" means "unsafe by default", which is not
project style in general and is especially not so for postgres_fdw.
We have tried very hard for years to ensure that postgres_fdw will
work with a wide range of remote server versions, and generally been
extremely conservative about what we think will work (example:
collations); but this patch seems ready to throw that principle away.

Also, surely "remoteversion < PG_VERSION_NUM" is backwards. What
this would mean is that nobody can ever change a partial aggregate's
implementation, because that would break queries issued from older
servers (that couldn't know about the change) to newer ones.

Realistically, I think it's fairly unsafe to try aggregate pushdown
to anything but the same PG major version; otherwise, you're buying
into knowing which aggregates have partial support in which versions,
as well as predicting the future about incompatible state changes.
Even that isn't bulletproof --- e.g, maybe somebody wasn't careful
about endianness-independence of the serialized partial state, making
it unsafe to ship --- so there had better be a switch whereby the user
can disable it.

Maybe we could define a three-way setting:

* default: push down partial aggs only to same major PG version
* disable: don't push down, period
* force: push down regardless of remote version

With the "force" setting, it's the user's responsibility not to
issue any remote-able aggregation that would be unsafe to push
down. This is still a pretty crude tool: I can foresee people
wanting to have per-aggregate control over things, especially
extension-supplied aggregates. But it'd do for starters.

I'm not super thrilled by the fact that the patch contains zero
user-facing documentation, even though it's created new SQL syntax,
not to mention a new postgres_fdw option. I assume this means that
nobody thinks it's anywhere near ready to commit.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2024-03-19 21:39:39 Re: documentation structure
Previous Message Dean Rasheed 2024-03-19 21:06:03 Re: Improving EXPLAIN's display of SubPlan nodes