Re: Partial aggregates pushdown

From: "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Bruce Momjian <bruce(at)momjian(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, vignesh C <vignesh21(at)gmail(dot)com>, "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp>
Subject: Re: Partial aggregates pushdown
Date: 2024-02-22 07:20:45
Message-ID: TYAPR01MB5514F0CBD9CD4F84A261198195562@TYAPR01MB5514.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi. Mr.Haas, hackers.

I apologize for the significant delay since my last post.
I have conducted investigations and considerations regarding the remaining tasks as follows.
Would it be possible for you to review them?
In particular, could you please confirm if the approach mentioned in 1. is acceptable?
If there are no issues with the direction outlined in 1., I plan to make a simple prototype based on this approach.

1. Transmitting state value safely between machines
> From: Robert Haas <robertmhaas(at)gmail(dot)com>
> Sent: Wednesday, December 6, 2023 10:25 PM
> the problems around transmitting
> serialized bytea blobs between machines that can't be assumed to fully trust each other will need to be addressed in some
> way, which seems like it will require a good deal of design work, forming some kind of consensus, and then implementation
> work to follow.
I have considered methods for safely transmitting state values between different machines.
I have taken into account the version policy of PostgreSQL (5 years of support) and the major version release cycle over the past 10 years (1 year), and as a result, I have made the assumption that transmission is allowed only when the difference between the local version and the remote version is 5 or less.
I believe that by adding new components, "export function" and "import function", to the aggregate functions, and further introducing a new SQL keyword to the query syntax of aggregate expressions, we can address this issue.
If the version of the local server is higher than or equal to the version of the remote server, the proposed method can be simplified. The export version mentioned later in (1) would not be necessary. Furthermore, if the version of the local server matches the version of the remote server, the proposed method can be further simplified.
I would appreciate your input on reasonable assumptions regarding the differences in versions between the local server and the remote server.
I will explain the specifications of the export function, import function, the new SQL keyword for aggregate expressions, and the behavior of query processing for partial aggregation separately.
(1) Export Function Specification
This function is another final function for partial aggregate.
This function converts the state value that represents the result of partial aggregation into a format that can be read by the local server.
This function is called instead of the existing finalfunc during the final stage of aggregation when performing partial aggregation.
The conversion process described above will be referred to as "export".
The argument of an export function is the version of the server that will receive the return value.
Hereafter, this version will be referred to as the export version.
The concept of an export version is necessary to handle cases where the version of the local server is smaller than the version of the remote server.
The return value of the export function is the transformed state value, and its data type is bytea.
For backward compatibility, the developer of the export function must ensure that the export can be performed for major versions up to five versions prior to the major version of PostgreSQL that the export function is being developed for.
For built-in functions, I believe it is necessary to allow for the possibility of not developing the export functionality for specific versions in the future (due to reasons such as development burden) after the export function is developed for a certain version.
To achieve this, for built-in functions, we will add a column to the pg_aggregate catalog that indicates the presence or absence of export functionality for each major version, including the major version being developed and the previous five major versions. This column will be named safety_export_versions and will have a data type of boolean[6].
For user-defined functions, we will refer to the extensions option and add an external server option called safety_export_extensions, which will maintain a list of extensions that include only the aggregate functions that can be exported to the local server version.

(2) Import Function Specification
The import function is a function that performs validity checks on the exported data and converts it into a state value. The process of this conversion is referred to as importing.
The import function is called from postgres_fdw in the local server.
The arguments of the import function are the exported data and the export version.
The return value of the import function is a state value that can be read on the local server.
The import function will terminate with an error if the validity check determines that the exported result cannot be read on the local server.
For backward compatibility, developers of the import function must ensure that it can be imported if the export version is up to five versions prior to their own version.

(3) The new SQL keyword for aggregate expressions
The local server passes the instructions for partial aggregation and the export version to the remote server using SQL keywords. The syntax specification is as follows:
aggregate_function(PARTIAL_AGGREGATE(export_version) expr)

Here, PARTIAL_AGGREGATE is a keyword that indicates partial aggregation, and export_version is a string constant that indicates the export version.

(4) The behavior of query processing for partial aggregation
I will illustrate the flow of query processing using the example query "select aggfunc(c) from t".
In the following explanation, the major version of the remote server will be referred to as remote_version, and the major version of the local server will be referred to as local_version.
STEP1. Checking the feasibility of partial aggregation pushdown on the local server
(i) Retrieving the remote_version
The postgres_fdw connects to the remote server and retrieves the remote_version.
(ii) Checking the versions
The postgres_fdw determines whether the difference between local_version and remote_version is within 5. If the difference is 6 or more, it is determined that partial aggregation pushdown is not possible.
(iii) Checking the import function
The postgres_fdw checks the pg_aggregate catalog to see if there is an import function for aggfunc. If there is none, it is determined that partial aggregation pushdown is not possible.
(iv) Checking the export function
If aggfunc is a built-in function, the postgres_fdw checks the pg_aggregate catalog. It checks if there is a version number export_version that satisfies the conditions local_version >= export_version >= local_version-5 and if there is an export function available for that version. If the version number export_version does not exist, it is determined that partial aggregation pushdown is not possible. This check is only performed if local_version >= remote_version.
If aggfunc is a user-defined function, the postgres_fdw checks if the extension on which aggfunc depends is included in export_safety_extensions. If it is not included, it is determined that partial aggregation pushdown is not possible.

STEP2. Sending a remote query on the local server
The query containing the keyword indicating partial aggregation is sent to the remote server. The remote query for the sample query would be as follows:
"select aggfunc(PARTIAL_AGGREGATE(export_version) c) from t"

STEP3. Executing the remote query on the remote server
The remote server performs partial aggregation for aggfunc. Instead of calling the final function at the last stage of aggregation, the remote server calls the export function with export_version and generates the return value of the partial aggregation.

STEP4. Receiving the result of the remote query on the local server
The postgres_fdw passes the export_version and the return value of STEP3 to the import function of aggfunc and receives the state value. The postgres_fdw then passes the received state value to the executor of the local server.

2. The approach of adding SQL keywords
> From: Robert Haas <robertmhaas(at)gmail(dot)com>
> Sent: Tuesday, November 21, 2023 5:52 AM
> I do have a concern about this, though. It adds a lot of bloat. It adds a whole lot of additional entries to pg_aggregate, and
> every new aggregate we add in the future will require a bonus entry for this, and it needs a bunch of new pg_proc entries as
> well. One idea that I've had in the past is to instead introduce syntax that just does this, without requiring a separate
> aggregate definition in each case.
> For example, maybe instead of changing string_agg(whatever) to string_agg_p_text_text(whatever), you can say
> PARTIAL_AGGREGATE
> string_agg(whatever) or string_agg(PARTIAL_AGGREGATE whatever) or something. Then all aggregates could be treated
> in a generic way. I'm not completely sure that's better, but I think it's worth considering.
I have prototyped an approach using SQL keywords for the patch that does not include the functionality of Step 1. Please find the prototype attached as a file.
# I apologize for not including sufficient comments, documentation, and tests in the prototype. Please understand.
Mainly, it seems that we can address this by adding handling for the new SQL keywords in the parser and making modifications to the finalize process for aggregation in the executor.
As pointed out by Mr.Haas, it has been realized that the code can be significantly simplified.
The additional lines of code, excluding documentation and tests, are as follows.
Adding new aggregate functions approach(approach #1): 1,069
Adding new SQL keyword approach(approach #2): 318
As mentioned in 1., I plan to modify the patch by adding SQL keywords in the future.

3. Fixing the behavior when the HAVING clause is present
> From: Robert Haas <robertmhaas(at)gmail(dot)com>
> Sent: Tuesday, November 28, 2023 4:08 AM
>
> On Wed, Nov 22, 2023 at 1:32 AM Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru> wrote:
> > Hi. HAVING is also a problem. Consider the following query
> >
> > SELECT count(a) FROM t HAVING count(a) > 10 - we can't push it down to
> > foreign server as HAVING needs full aggregate result, but foreign
> > server don't know it.
>
> I don't see it that way. What we would push to the foreign server would be something like SELECT count(a) FROM t. Then,
> after we get the results back and combine the various partial counts locally, we would locally evaluate the HAVING clause
> afterward. That is, partial aggregation is a barrier to pushing down HAVING clause itself, but it doesn't preclude pushing
> down the aggregation.
I have made modifications in the attached patch to ensure that when the HAVING clause is present, the HAVING clause is executed locally while the partial aggregations are pushed down.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Attachment Content-Type Size
0001-Partial-aggregates-push-down-SQL-keyword.patch application/octet-stream 143.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2024-02-22 07:22:06 Re: Add lookup table for replication slot invalidation causes
Previous Message Laurenz Albe 2024-02-22 07:16:09 Re: Speeding up COPY TO for uuids and arrays