Re: A reloption for partitioned tables - parallel_workers

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Seamus Abshere <seamus(at)abshere(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A reloption for partitioned tables - parallel_workers
Date: 2021-03-03 16:20:19
Message-ID: aff8a6fd5ef4738afb1fe8dc9b7a9018feaad4c5.camel@cybertec.at
Lists: pgsql-hackers

On Wed, 2021-03-03 at 17:58 +0900, Amit Langote wrote:
> On Tue, Mar 2, 2021 at 5:47 PM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
> > On Tue, 2021-03-02 at 11:23 +0900, Amit Langote wrote:
> > > I got the same result with my implementation, but I am wondering if
> > > setting parallel_workers=0 on the parent table shouldn't really
> > > disable a regular (non-parallel-aware) Append running under Gather
> > > even if it does Parallel Append (parallel-aware)? So in this test
> > > case, there should have been a Gather atop Append, with individual
> > > partitions scanned using Parallel Seq Scan where applicable.
> >
> > I am not sure, but I tend to think that if you specify no
> > parallel workers, you want no parallel workers.
>
> I am thinking that one would set parallel_workers on a parent
> partitioned table to control only how many workers a Parallel Append
> can spread across partitions or use parallel_workers=0 to disable this
> form of partition parallelism. However, one may still want the
> individual partitions to be scanned in parallel, where workers only
> spread across the partition's blocks. IMO, we should try to keep
> those two forms of parallelism separately configurable.
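
To spell out the two independently configurable knobs being proposed here, a
hypothetical configuration might look like this (table names are illustrative,
and the semantics assume the behavior Amit describes):

```sql
-- Disable Parallel Append across the partitions of the parent ...
ALTER TABLE parent_tbl SET (parallel_workers = 0);

-- ... while still letting workers spread across each partition's blocks:
ALTER TABLE parent_tbl_p1 SET (parallel_workers = 4);
ALTER TABLE parent_tbl_p2 SET (parallel_workers = 4);
```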

I see your point.

I thought that PostgreSQL might consider such a plan anyway, but
I am not deep enough into the partitioning code to know.

Thinking about this further, wouldn't that mean that we could get into a
conflict if someone sets "parallel_workers" on both a partition and
the partitioned table? Which setting should win?
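
Concretely, the conflicting case would be something like this (hypothetical
table names; it is not obvious from the proposal which value the planner
should honor):

```sql
ALTER TABLE parent_tbl SET (parallel_workers = 8);     -- the partitioned table
ALTER TABLE parent_tbl_p1 SET (parallel_workers = 2);  -- one of its partitions
-- Does a plan covering parent_tbl_p1 get 8 workers, 2, or some combination?
```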

> > SET enable_partitionwise_aggregate = on;
> >
> > EXPLAIN (COSTS OFF)
> > SELECT count(*) FROM pagg_tab_ml;
> >                                   QUERY PLAN
> > ------------------------------------------------------------------------------
> >  Finalize Aggregate
> >    ->  Gather
> >          Workers Planned: 4
> >          ->  Parallel Append
> >                ->  Partial Aggregate
> >                      ->  Parallel Seq Scan on pagg_tab_ml_p1 pagg_tab_ml
> >                ->  Partial Aggregate
> >                      ->  Parallel Seq Scan on pagg_tab_ml_p3_s1 pagg_tab_ml_3
> >                ->  Partial Aggregate
> >                      ->  Parallel Seq Scan on pagg_tab_ml_p2_s1 pagg_tab_ml_1
> >                ->  Partial Aggregate
> >                      ->  Parallel Seq Scan on pagg_tab_ml_p3_s2 pagg_tab_ml_4
> >                ->  Partial Aggregate
> >                      ->  Parallel Seq Scan on pagg_tab_ml_p2_s2 pagg_tab_ml_2
> > (14 rows)
> >
> > The default number of parallel workers is taken, because the append is
> > on an upper relation, not the partitioned table itself.
> >
> > One would wish that "parallel_workers" somehow percolated up,
>
> It appears that we don't set the fields of an upper relation such that
> IS_PARTITIONED_REL() would return true for it, like we do for base and
> join relations. In compute_append_parallel_workers(), we're requiring
> it to be true to even look at the relation's rel_parallel_workers. We
> can set those properties in *some* grouping rels, for example, when
> the aggregation is grouped on the input relation's partition key.
> That would make it possible for the Append on such grouping relations
> to refer to their input partitioned relation's rel_parallel_workers.
> For example, with the attached PoC patch:
>
> SET parallel_setup_cost TO 0;
> SET max_parallel_workers_per_gather TO 8;
> SET enable_partitionwise_aggregate = on;
>
> alter table pagg_tab_ml set (parallel_workers=5);
>
> EXPLAIN (COSTS OFF) SELECT a, count(*) FROM pagg_tab_ml GROUP BY 1;
>                              QUERY PLAN
> ---------------------------------------------------------------------
>  Gather
>    Workers Planned: 5
>    ->  Parallel Append
>          ->  HashAggregate
>                Group Key: pagg_tab_ml_5.a
>                ->  Append
>                      ->  Seq Scan on pagg_tab_ml_p3_s1 pagg_tab_ml_5
>                      ->  Seq Scan on pagg_tab_ml_p3_s2 pagg_tab_ml_6
>          ->  HashAggregate
>                Group Key: pagg_tab_ml.a
>                ->  Seq Scan on pagg_tab_ml_p1 pagg_tab_ml
>          ->  HashAggregate
>                Group Key: pagg_tab_ml_2.a
>                ->  Append
>                      ->  Seq Scan on pagg_tab_ml_p2_s1 pagg_tab_ml_2
>                      ->  Seq Scan on pagg_tab_ml_p2_s2 pagg_tab_ml_3
> (16 rows)
>
> alter table pagg_tab_ml set (parallel_workers=0);
>
> EXPLAIN (COSTS OFF) SELECT a, count(*) FROM pagg_tab_ml GROUP BY 1;
>                                         QUERY PLAN
> ------------------------------------------------------------------------------------------
>  Append
>    ->  Finalize GroupAggregate
>          Group Key: pagg_tab_ml.a
>          ->  Gather Merge
>                Workers Planned: 1
>                ->  Sort
>                      Sort Key: pagg_tab_ml.a
>                      ->  Partial HashAggregate
>                            Group Key: pagg_tab_ml.a
>                            ->  Parallel Seq Scan on pagg_tab_ml_p1 pagg_tab_ml
>    ->  Finalize GroupAggregate
>          Group Key: pagg_tab_ml_2.a
>          ->  Gather Merge
>                Workers Planned: 2
>                ->  Sort
>                      Sort Key: pagg_tab_ml_2.a
>                      ->  Parallel Append
>                            ->  Partial HashAggregate
>                                  Group Key: pagg_tab_ml_2.a
>                                  ->  Parallel Seq Scan on pagg_tab_ml_p2_s1 pagg_tab_ml_2
>                            ->  Partial HashAggregate
>                                  Group Key: pagg_tab_ml_3.a
>                                  ->  Parallel Seq Scan on pagg_tab_ml_p2_s2 pagg_tab_ml_3
>    ->  Finalize GroupAggregate
>          Group Key: pagg_tab_ml_5.a
>          ->  Gather Merge
>                Workers Planned: 2
>                ->  Sort
>                      Sort Key: pagg_tab_ml_5.a
>                      ->  Parallel Append
>                            ->  Partial HashAggregate
>                                  Group Key: pagg_tab_ml_5.a
>                                  ->  Parallel Seq Scan on pagg_tab_ml_p3_s1 pagg_tab_ml_5
>                            ->  Partial HashAggregate
>                                  Group Key: pagg_tab_ml_6.a
>                                  ->  Parallel Seq Scan on pagg_tab_ml_p3_s2 pagg_tab_ml_6
> (36 rows)
>
> alter table pagg_tab_ml set (parallel_workers=9);
>
> EXPLAIN (COSTS OFF) SELECT a, count(*) FROM pagg_tab_ml GROUP BY 1;
>                              QUERY PLAN
> ---------------------------------------------------------------------
>  Gather
>    Workers Planned: 8
>    ->  Parallel Append
>          ->  HashAggregate
>                Group Key: pagg_tab_ml_5.a
>                ->  Append
>                      ->  Seq Scan on pagg_tab_ml_p3_s1 pagg_tab_ml_5
>                      ->  Seq Scan on pagg_tab_ml_p3_s2 pagg_tab_ml_6
>          ->  HashAggregate
>                Group Key: pagg_tab_ml.a
>                ->  Seq Scan on pagg_tab_ml_p1 pagg_tab_ml
>          ->  HashAggregate
>                Group Key: pagg_tab_ml_2.a
>                ->  Append
>                      ->  Seq Scan on pagg_tab_ml_p2_s1 pagg_tab_ml_2
>                      ->  Seq Scan on pagg_tab_ml_p2_s2 pagg_tab_ml_3
> (16 rows)

That looks good!

One could imagine similar behavior for partitionwise joins, but
it might be difficult to decide which side should determine the
number of parallel workers.
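
For instance, in a case like the following (hypothetical tables, both
partitioned and joined partitionwise), it is unclear whose setting a
joinrel-level Parallel Append should inherit:

```sql
SET enable_partitionwise_join = on;

ALTER TABLE orders    SET (parallel_workers = 2);
ALTER TABLE lineitems SET (parallel_workers = 6);

-- Should the Parallel Append over the per-partition joins plan 2 workers,
-- 6, or something derived from both?
SELECT *
FROM orders o
JOIN lineitems l ON l.order_id = o.id;
```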

I think that with this addition, the patch would be a useful improvement.

Yours,
Laurenz Albe
