Re: Partial aggregates pushdown

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)mitsubishielectric(dot)co(dot)jp>, Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>
Subject: Re: Partial aggregates pushdown
Date: 2023-11-27 20:59:29
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Also, I want to make one other point here about security and
> reliability. Right now, there is no way for a user to feed arbitrary
> data to a deserialization function. Since serialization and
> deserialization functions are only used in the context of parallel
> query, we always know that the data fed to the deserialization
> function must have come from the serialization function on the same
> machine. Nor can users call the deserialization function directly with
> arbitrary data of their own choosing, because users cannot call
> functions that take or return internal. But with this system, it
> becomes possible to feed arbitrary data to a deserialization function.

Ouch. That is absolutely horrid --- we have a lot of stuff that
depends on users not being able to get at "internal" values, and
it sounds like the current proposal breaks all of that.

Quite aside from security concerns, there is no justification for
assuming that the "internal" values used on one platform/PG version
are identical to those used on another. So if the idea is to
ship back "internal" values from the remote server to the local one,
I think it's basically impossible to make that work.

Even if the partial-aggregate serialization value isn't "internal"
but some more-narrowly-defined type, it is still an internal
implementation detail of the aggregate. You have no right to assume
that the remote server implements the aggregate the same way the
local one does. If we start making such an assumption then we'll
be unable to revise the implementation of an aggregate ever again.

TBH, I think this entire proposal is dead in the water. Which is
sad from a performance standpoint, but I can't see any way that
we would not regret shipping a feature that makes such assumptions.

regards, tom lane

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2023-11-27 21:00:30 Re: locked reads for atomics
Previous Message Jeff Davis 2023-11-27 20:39:57 Re: proposal: change behavior on collation version mismatch