Quick Links

Re: [HACKERS] Partition-wise aggregation/grouping

From:	Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To:	Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc:	Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: [HACKERS] Partition-wise aggregation/grouping
Date:	2018-01-02 07:30:34
Message-ID:	CAM2+6=VuerUdPdMX0OCXbSXxiJS7SY-T_963RqOiLF=w_MpJ=A@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Dec 12, 2017 at 3:43 PM, Ashutosh Bapat <
ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:

> Here are review comments for 0009
> Only full aggregation is pushed on the remote server.
>
> I think we can live with that for a while but we need to be able to push
> down
> partial aggregates to the foreign server. I agree that it needs some
> infrastructure to serialized and deserialize the partial aggregate values,
> support unpartitioned aggregation first and then work on partitioned
> aggregates. That is definitely a separate piece of work.
>

Yep.

>
> +CREATE TABLE pagg_tab_p3 (a int, b int, c text);
>
> Like partition-wise join testcases please use LIKE so that it's easy to
> change
> the table schema if required.
>

Done.

>
> +INSERT INTO pagg_tab_p1 SELECT i % 30, i % 50, to_char(i/30,
> 'FM0000') FROM generate_series(1, 3000) i WHERE (i % 30) < 10;
> +INSERT INTO pagg_tab_p2 SELECT i % 30, i % 50, to_char(i/30,
> 'FM0000') FROM generate_series(1, 3000) i WHERE (i % 30) < 20 and (i %
> 30) >= 10;
> +INSERT INTO pagg_tab_p3 SELECT i % 30, i % 50, to_char(i/30,
> 'FM0000') FROM generate_series(1, 3000) i WHERE (i % 30) < 30 and (i %
> 30) >= 20;
>
> We have to do this because INSERT tuple routing to a foreign partition is
> not
> supported right now.

Yes.

> Somebody has to remember to change this to a single
> statement once that's done.
>

I don't know how we can keep track of it.

>
> +ANALYZE fpagg_tab_p1;
> +ANALYZE fpagg_tab_p2;
> +ANALYZE fpagg_tab_p3;
>
> I thought this is not needed. When you ANALYZE the partitioned table, it
> would
> analyze the partitions as well. But I see that partition-wise join is also
> ANALYZING the foreign partitions separately. When I ran ANALYZE on a
> partitioned table with foreign partitions, statistics for only the local
> tables
> (partitioned and partitions) was updated. Of course this is separate work,
> but
> probably needs to be fixed.
>

Hmm.

>
> +-- When GROUP BY clause matches with PARTITION KEY.
> +-- Plan when partition-wise-agg is disabled
>
> s/when/with/
>
> +-- Plan when partition-wise-agg is enabled
>
> s/when/with/
>

Done.

>
> + -> Append
>
> Just like ForeignScan node's Relations tell what kind of ForeignScan this
> is,
> may be we should annotate Append to tell whether the children are joins,
> aggregates or relation scans. That might be helpful. Of course as another
> patch.
>

OK.

>
> +SELECT a, sum(b), min(b), count(*) FROM pagg_tab GROUP BY a HAVING
> avg(b) < 25 ORDER BY 1;
> + a | sum | min | count
> +----+------+-----+-------
> + 0 | 2000 | 0 | 100
> + 1 | 2100 | 1 | 100
> [ ... clipped ...]
> + 23 | 2300 | 3 | 100
> + 24 | 2400 | 4 | 100
> +(15 rows)
>
> May be we want to reduce the number of rows to a few by using a stricter
> HAVING
> clause?
>

Done.

>
> +
> +-- When GROUP BY clause not matches with PARTITION KEY.
>
> ... clause does not match ...
>

Done.

>
> +EXPLAIN (VERBOSE, COSTS OFF)
> +SELECT b, avg(a), max(a), count(*) FROM pagg_tab GROUP BY b HAVING
> sum(a) < 800 ORDER BY 1;
> +
> QUERY PLAN
> +-----------------------------------------------------------
> ------------------------------------------------------------
> ------------------------------
> + Sort
> + Output: fpagg_tab_p1.b, (avg(fpagg_tab_p1.a)),
> (max(fpagg_tab_p1.a)), (count(*))
> + -> Partial HashAggregate
> [ ... clipped ... ]
> + Output: fpagg_tab_p3.b, PARTIAL
> avg(fpagg_tab_p3.a), PARTIAL max(fpagg_tab_p3.a), PARTIAL count(*),
> PARTIAL sum(fpagg_tab_p3.a)
> + Group Key: fpagg_tab_p3.b
> + -> Foreign Scan on public.fpagg_tab_p3
> + Output: fpagg_tab_p3.b, fpagg_tab_p3.a
> + Remote SQL: SELECT a, b FROM public.pagg_tab_p3
> +(26 rows)
>
> I think we interested in overall shape of the plan and not the details of
> Remote SQL etc. So, may be turn off VERBOSE. This comment applies to an
> earlier
> plan with enable_partition_wise_agg = false;
>

OK. Removed VERBOSE from all the queries as we are interested in overall
shape of the plan.

>
> +
> +SELECT b, avg(a), max(a), count(*) FROM pagg_tab GROUP BY b HAVING
> sum(a) < 800 ORDER BY 1;
> + b | avg | max | count
> +----+---------------------+-----+-------
> + 0 | 10.0000000000000000 | 20 | 60
> + 1 | 11.0000000000000000 | 21 | 60
> [... clipped ...]
> + 42 | 12.0000000000000000 | 22 | 60
> + 43 | 13.0000000000000000 | 23 | 60
> +(20 rows)
>
> Since the aggregates were not pushed down, I doubt we should be testing the
> output. But this test is good to check partial aggregates over foreign
> partition scans, which we don't have in postgres_fdw.sql I think. So, may
> be
> add it as a separate patch?
>

Agree. Removed SELECT query. EXPLAIN is enough here.

>
> Can you please add a test where we reference a whole-row; that usually has
> troubles.
>

Added.

>
> - if (root->hasHavingQual && query->havingQual)
> + if (root->hasHavingQual && fpinfo->havingQual)
>
> This is not exactly a problem with your patch, but why do we need to check
> both
> the boolean and the actual clauses? If the boolean is true,
> query->havingQual
> should be non-NULL and NULL otherwise.
>

Agree. But since this is not fault of this patch, I have kept as is.

>
> /* Grouping information */
> List *grouped_tlist;
> + PathTarget *grouped_target;
> + Node *havingQual;
>
> I think we don't need havingQual as a separate member.
> foreign_grouping_ok()
> separates the clauses in havingQual into shippable and non-shippable
> clauses
> and saves in local_conditions and remote_conditions. Probably we want to
> use
> those instead of adding a new member.
>

local/remote_conditions is a list of RestrictInfos which cannot be used
while costing the aggregates. So we anyways need to store the havingQual.

>
> index 04e43cc..c8999f6 100644
> --- a/src/include/foreign/fdwapi.h
> +++ b/src/include/foreign/fdwapi.h
> @@ -62,7 +62,8 @@ typedef void (*GetForeignJoinPaths_function)
> (PlannerInfo *root,
> typedef void (*GetForeignUpperPaths_function) (PlannerInfo *root,
> UpperRelationKind stage,
> RelOptInfo *input_rel,
> - RelOptInfo *output_rel);
> + RelOptInfo *output_rel,
> + UpperPathExtraData *extra);
>
> Probably this comment belongs to 0007, but it's in this patch that it
> becomes
> clear how invasive UpperPathExtraData changes are. While
> UpperPathExtraData has
> upper paths in the name, all of its members are related to grouping. That's
> fine since we only support partition-wise aggregate and not the other upper
> operations. But if we were to do that in future, which of these members
> would
> be applicable to other upper relations? inputRows, pathTarget,
> partialPathTarget may be applicable to other upper rels as well. can_sort,
> can_hash may be applicable to DISTINCT, SORT relations. isPartial and
> havingQual will be applicable only to Grouping/Aggregation. So, may be
> it's ok,
> and like RelOptInfo we may separate them by comments.
>

I have grouped them like you said but not added any comments yet as I am
not sure at this point that which fields will be used by those other upper
rel kinds.
Please have a look.

>
> Another problem with that structure is its name doesn't mention that the
> structure is used only for child upper relations, whereas the code assumes
> that
> if extra is not present it's a parent upper relation. May be we want to
> rename
> it to that effect or always use it whether for a parent or a child
> relation.
>

Renamed to OtherUpperPathExtraData.

> We may want to rename pathTarget and partialPathTarget as relTarget and
> partialRelTarget since those targets are not specific to any path, but
> will be
> applicable to all the paths created for that rel.
>

Renamed.

These fixes are part of the v9 patchset.

Thanks
--
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

In response to

Re: [HACKERS] Partition-wise aggregation/grouping at 2017-12-12 10:13:11 from Ashutosh Bapat

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Jeevan Chalke	2018-01-02 08:13:32	Re: [HACKERS] Partition-wise aggregation/grouping
Previous Message	Jeevan Chalke	2018-01-02 07:06:36	Re: [HACKERS] Partition-wise aggregation/grouping