Re: Partial aggregates pushdown

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Ilya Gladyshev <i(dot)gladyshev(at)postgrespro(dot)ru>, Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject: Re: Partial aggregates pushdown
Date: 2021-11-01 23:49:08
Message-ID: 46267dde-a1c6-6d73-c289-2720f275b4f1@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/1/21 22:53, Ilya Gladyshev wrote:
>
> On 01.11.2021 13:30, Alexander Pyhalov wrote:
>> Peter Eisentraut писал 2021-11-01 12:47:
>>> On 21.10.21 12:55, Alexander Pyhalov wrote:
>>>> Now aggregates with internal states can be pushed down, if they are
>>>> marked as pushdown safe (this flag is set to true for min/max/sum),
>>>> have internal states and associated converters. Converters are
>>>> called locally, they transform aggregate result to serialized
>>>> internal representation.
>>>> As converters don't have access to internal aggregate state, partial
>>>> aggregates like avg() are still not pushable.
>>>
>>> It seems to me that the system should be able to determine from the
>>> existing aggregate catalog entry whether an aggregate can be pushed
>>> down.  For example, it could check aggtranstype != internal and
>>> similar.  A separate boolean flag should not be necessary.
>>
>> Hi.
>> I think we can't infer this property from existing flags. For example,
>> if I have avg() with bigint[] argtranstype, it doesn't mean we can
>> push down it. We couldn't also decide if partial aggregete is safe to
>> push down based on aggfinalfn presence (for example, it is defined for
>> sum(numeric), but we can push it down.
>
> I think one potential way to do it would be to allow pushing down
> aggregates that EITHER have state of the same type as their return type,
> OR have a conversion function that converts their return value to the
> type of their state.
>

IMO just checking (aggtranstype == result type) entirely ignores the
issue of portability - we've never required the aggregate state to be
portable in any meaningful way (between architectures, minor/major
versions, ...) and it seems foolish to just start relying on it here.

Imagine for example an aggregate using bytea state, storing some complex
C struct in it. You can't just copy that between architectures.

It's a bit like why we don't simply copy data types to network, but pass
them through input/output or send/receive functions. The new flag is a
way to mark aggregates where this is safe, and I don't think we can do
away without it.

The more I think about this, the more I'm convinced the proper way to do
this would be adding export/import functions, similar to serial/deserial
functions, with the extra portability guarantees. And we'd need to do
that for all aggregates, not just those with (aggtranstype == internal).

I get it - the idea of the patch is that keeping the data types the same
makes it much simpler to pass the aggregate state (compared to having to
export/import it). But I'm not sure it's the right approach.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2021-11-02 00:57:48 Re: inefficient loop in StandbyReleaseLockList()
Previous Message David Zhang 2021-11-01 22:46:39 Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit