Re: Partial aggregates pushdown

From: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To: "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)mitsubishielectric(dot)co(dot)jp>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Ilya Gladyshev <i(dot)gladyshev(at)postgrespro(dot)ru>
Subject: Re: Partial aggregates pushdown
Date: 2023-09-26 13:15:14
Message-ID: 37c9d93c2dca786f534639136d1da8f3@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp писал 2023-09-25 06:18:
> Hi Mr.Bruce, Mr.Pyhalov, Mr.Finnerty, hackers.
>
> Thank you for your valuable comments. I sincerely apologize for the
> very late reply.
> Here is a response to your comments or a fix to the patch.
>
> Tuesday, August 8, 2023 at 3:31 Bruce Momjian
>> > I have modified the program except for the point "if the version of the remote server is less than PG17".
>> > Instead, we have addressed the following.
>> > "If check_partial_aggregate_support is true and the remote server version is older than the local server
>> > version, postgres_fdw does not assume that the partial aggregate function is on the remote server unless
>> > the partial aggregate function and the aggregate function match."
>> > The reason for this is to maintain compatibility with any aggregate function that does not support partial
>> > aggregate in one version of V1 (V1 is PG17 or higher), even if the next version supports partial aggregate.
>> > For example, string_agg does not support partial aggregation in PG15, but it will support partial aggregation
>> > in PG16.
>>
>> Just to clarify, I think you are saying:
>>
>> If check_partial_aggregate_support is true and the remote
>> server
>> version is older than the local server version, postgres_fdw
>> checks if the partial aggregate function exists on the remote
>> server during planning and only uses it if it does.
>>
>> I tried to phrase it in a positive way, and mentioned the plan time
>> distinction. Also, I am sorry I was away for most of July and am just
>> getting to this.
> Thanks for your comment. In the documentation, the description of
> check_partial_aggregate_support is as follows
> (please see postgres-fdw.sgml).
> --
> check_partial_aggregate_support (boolean)
> Only if this option is true, during query planning, postgres_fdw
> connects to the remote server and check if the remote server version
> is older than the local server version. If so, postgres_fdw assumes
> that for each built-in aggregate function, the partial aggregate
> function is not defined on the remote server unless the partial
> aggregate function and the aggregate function match. The default is
> false.
> --
>
> Thursday, 20 July 2023 19:23 Alexander Pyhalov
> <a(dot)pyhalov(at)postgrespro(dot)ru>:
>> Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp писал 2023-07-19 03:43:
>> > Hi Mr.Pyhalov, hackers.
>>
>> > 3)
>> > I modified the patch to safely do a partial aggregate pushdown for
>> > queries which contain having clauses.
>> >
>>
>> Hi.
>> Sorry, but I don't see how it could work.
> We apologize for any inconvenience caused.
> Thanks to Pyhalov's and Jim's comments, I have realized that I have
> made a fundamental mistake regarding the pushdown of the HAVING clause
> and the difficulty of achieving it performing Partial aggregate
> pushdown.
> So, I removed the codes about pushdown of the HAVING clause performing
> Partial aggregate pushdown.
>
> Thursday, 20 July 2023 19:23 Alexander Pyhalov
> <a(dot)pyhalov(at)postgrespro(dot)ru>:
>> As for changes in planner.c (setGroupClausePartial()) I have several
>> questions.
>>
>> 1) Why don't we add non_group_exprs to pathtarget->exprs when
>> partial_target->exprs is not set?
>>
>> 2) We replace extra->partial_target->exprs with partial_target->exprs
>> after processing. Why are we sure that after this tleSortGroupRef is
>> correct?
> Response to 1)
> The code you pointed out was unnecessary. I have removed this code.
> Also, the process of adding PlaceHolderVar's expr to partial_target was
> missing.
> So I fixed this.
>
> Response to 2)
> The making procedures extra->groupClausePartial and
> extra->partial_target
> in make_partial_grouping_target for this patch is as follows.
> STEP1. From grouping_target->exprs, extract Aggref, Var and
> Placeholdervar that are not included in Aggref.
> STEP2. setGroupClausePartial sets the copy of original groupClause to
> extra->groupClausePartial
> and sets the copy of original partial_target to extra->partial_target.
> STEP3. setGroupClausePartial adds Var and Placeholdervar in STEP1 to
> partial_target.
> The sortgroupref of partial_target->sortgrouprefs to be added to value
> is set to
> (the maximum value of the existing sortgroupref) + 1.
> setGroupClausePartial adds data sgc of sortgroupclause type where
> sgc->tlesortgroupref
> matches the sortgroupref to GroupClause.
> STEP4. add_new_columns_to_pathtarget adds STEP1's Aggref to
> partial_target.
>
> Due to STEP2, the list of tlesortgrouprefs set in
> extra->groupClausePartial is not duplicated.

Do you mean that extra->partial_target->sortgrouprefs is not replaced,
and so we preserve tlesortgroupref numbers?
I'm suspicious about rewriting extra->partial_target->exprs with
partial_target->exprs - I'm still not sure why we
don't we loose information, added by add_column_to_pathtarget() to
extra->partial_target->exprs?

Also look at the following example.

EXPLAIN VERBOSE SELECT count(*) , (b/2)::numeric FROM pagg_tab GROUP BY
b/2 ORDER BY 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Sort (cost=511.35..511.47 rows=50 width=44)
Output: (count(*)), ((((pagg_tab.b / 2)))::numeric), ((pagg_tab.b /
2))
Sort Key: (count(*))
-> Finalize HashAggregate (cost=509.06..509.94 rows=50 width=44)
Output: count(*), (((pagg_tab.b / 2)))::numeric, ((pagg_tab.b /
2))
Group Key: ((pagg_tab.b / 2))
-> Append (cost=114.62..506.06 rows=600 width=16)
-> Foreign Scan (cost=114.62..167.69 rows=200 width=16)
Output: ((pagg_tab.b / 2)), (PARTIAL count(*)),
pagg_tab.b
Relations: Aggregate on (public.fpagg_tab_p1
pagg_tab)
Remote SQL: SELECT (b / 2), count(*), b FROM
public.pagg_tab_p1 GROUP BY 1, 2
-> Foreign Scan (cost=114.62..167.69 rows=200 width=16)
Output: ((pagg_tab_1.b / 2)), (PARTIAL count(*)),
pagg_tab_1.b
Relations: Aggregate on (public.fpagg_tab_p2
pagg_tab_1)
Remote SQL: SELECT (b / 2), count(*), b FROM
public.pagg_tab_p2 GROUP BY 1, 2
-> Foreign Scan (cost=114.62..167.69 rows=200 width=16)
Output: ((pagg_tab_2.b / 2)), (PARTIAL count(*)),
pagg_tab_2.b
Relations: Aggregate on (public.fpagg_tab_p3
pagg_tab_2)
Remote SQL: SELECT (b / 2), count(*), b FROM
public.pagg_tab_p3 GROUP BY 1, 2

Note that group by is still deparsed incorrectly.
--
Best regards,
Alexander Pyhalov,
Postgres Professional

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2023-09-26 13:19:54 Re: [PATCH] Add inline comments to the pg_hba_file_rules view
Previous Message Amit Langote 2023-09-26 13:06:12 Re: generic plans and "initial" pruning