RE: Partial aggregates pushdown

From: "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp>
To: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Ilya Gladyshev <i(dot)gladyshev(at)postgrespro(dot)ru>, "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp>
Subject: RE: Partial aggregates pushdown
Date: 2023-06-06 03:08:50
Message-ID: OS3PR01MB66604E404B003A621B0C03A69552A@OS3PR01MB6660.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Mr.Pyhalov.

Thank you for your always thoughtful review.

> From: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
> Sent: Monday, June 5, 2023 6:00 PM
> Have found one issue -
>
> src/backend/catalog/pg_aggregate.c
>
> 585 if(strcmp(strVal(linitial(aggpartialfnName)),
> aggName) == 0){
> 586 if(((aggTransType != INTERNALOID) &&
> (finalfn != InvalidOid))
> 587 || ((aggTransType ==
> INTERNALOID) && (finalfn != serialfn)))
> 588 elog(ERROR, "%s is not its own
> aggpartialfunc", aggName);
> 589 } else {
>
> Here string comparison of aggName and aggpartialfnName looks very
> suspicios - it seems you should compare oids, not names (in this case,
> likely oids of transition function and partial aggregation function).
> The fact that aggregate name matches partial aggregation function name
> is not a enough to make any decisions.

I see no problem with this string comparison. Here is the reason.
The intent of this code is, to determine whether the user specifies
the new aggregate function whose aggpartialfunc is itself.
For two aggregate functions,
If the argument list and function name match, then the two aggregate functions must match.
By definition of aggpartialfunc,
every aggregate function and its aggpartialfn must have the same argument list.
Thus, if aggpartialfnName and aggName are equal as strings,
I think it is correct to determine that the user is specifying
the new aggregate function whose aggpartialfunc is itself.

However, since the document does not state these intentions
I think your suspicions are valid.
Therefore, I have added a specification to the document reflecting the above intentions.

> From: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
> Sent: Monday, June 5, 2023 6:00 PM
> In documentation
>
> doc/src/sgml/postgres-fdw.sgml:
>
> 930 <filename>postgres_fdw</filename> attempts to optimize remote
> queries to reduce
> 931 the amount of data transferred from foreign servers. This is
> done by
> 932 sending query <literal>WHERE</literal> clauses and ggregate
> expressions
> 933 to the remote server for execution, and by not retrieving table
> columns that
> 934 are not needed for the current query.
> 935 To reduce the risk of misexecution of queries,
> 936 <literal>WHERE</literal> clauses and ggregate expressions are
> not sent to
> 937 the remote server unless they use only data types, operators,
> and functions
> 938 that are built-in or belong to an extension that's listed in the
> foreign
> 939 server's <literal>extensions</literal> option.
> 940 Operators and functions in such clauses must
> 941 be <literal>IMMUTABLE</literal> as well.
>
> there are misprints in lines 932 and 936 - missing "a" in "aggregate"
> expressions.

Fixed.

> From: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
> Sent: Monday, June 5, 2023 6:00 PM
> Note that after these changes "select sum()" will fail for certain
> cases, when remote server version is not the latest. In other cases we
> tried
> to preserve compatibility. Should we have a switch for a foreign server
> to turn this optimization off? Or do we just state that users
> should use different workarounds if remote server version doesn't match
> local one?

It is the latter.
I added description about the above limitation to F.38.6. Built-in sharding in PostgreSQL and
F.38.8 Cross-Version Compatibility of doc/src/sgml/postgres-fdw.sgml.

> From: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
> Sent: Tuesday, June 6, 2023 3:15 AM
> Bruce Momjian писал 2023-06-05 19:26:
> > On Mon, Jun 5, 2023 at 12:00:27PM +0300, Alexander Pyhalov wrote:
> >> Note that after these changes "select sum()" will fail for certain
> >> cases, when remote server version is not the latest. In other cases
> >> we tried to preserve compatibility. Should we have a switch for a
> >> foreign server to turn this optimization off? Or do we just state
> >> that users should use different workarounds if remote server version
> >> doesn't match local one?
> >
> > We covered this in April in this and previous emails:
> >
> >
> https://www.postgresql.org/message-id/ZDGTza4rovCa%2BN3d%40
> momjian.us
> >
> > We don't check the version number for _any_ builtin functions so why
> > would we need to check for aggregate pushdown? Yes, these will be new
> > functions in PG 17, we have added functions regularly in major
> > releases and have never heard reports of problems about that.
> >
> Hi.
>
> I've seen this message. But introduction of new built-in function will break
> requests to old servers only if this new function is used in the request (this
> means that query changes). However, this patch changes the behavior of old
> queries, which worked prior to update. This seems to be different to me.

You are right.
However, for now, partial aggregates pushdown is mainly used when using built-in sharding in PostgreSQL.
I believe when using built-in sharding in PostgreSQL, the version of the primary node server and
the version of the remote server will usually be the same.
So I think it would be sufficient to include in the documentation a note about such problem
and a workaround for them.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Attachment Content-Type Size
0001-Partial-aggregates-push-down-v19.patch application/octet-stream 232.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Vik Fearing 2023-06-06 03:13:52 Add support for AT LOCAL
Previous Message Amit Kapila 2023-06-06 02:35:49 Re: Implement generalized sub routine find_in_log for tap test