From: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
---|---|
To: | Amit Langote <amitlangote09(at)gmail(dot)com>, Seamus Abshere <seamus(at)abshere(dot)net> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: A reloption for partitioned tables - parallel_workers |
Date: | 2021-02-20 03:54:59 |
Message-ID: | 5c53892162e8af3b47fa8a3f4b038c888dae65bb.camel@cybertec.at |
Lists: | pgsql-hackers |
On Fri, 2021-02-19 at 16:30 +0900, Amit Langote wrote:
> On Tue, Feb 16, 2021 at 1:35 AM Seamus Abshere <seamus(at)abshere(dot)net> wrote:
> > > Here we go, my first patch... solves https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55@www.fastmail.com
>
> Here is an updated version of the Seamus' patch that takes into
> account these and other comments received on this thread so far.
> Maybe warrants adding some tests too but I haven't.
Yes, there should be regression tests.
I gave the patch a spin, and it allows raising the number of workers for
a parallel append, as advertised.
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1337,8 +1337,9 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
If a table parameter value is set and the
equivalent <literal>toast.</literal> parameter is not, the TOAST table
will use the table's parameter value.
- Specifying these parameters for partitioned tables is not supported,
- but you may specify them for individual leaf partitions.
+ Specifying most of these parameters for partitioned tables is not
+ supported, but you may specify them for individual leaf partitions;
+ refer to the description of individual parameters for more details.
</para>
This doesn't make me happy. Since the options themselves do not say if they
are supported on partitioned tables or not, the reader is left in the dark.
Perhaps:
These options, with the exception of <literal>parallel_workers</literal>,
are not supported on partitioned tables, but you may specify them for individual
leaf partitions.
@@ -1401,9 +1402,12 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
<para>
This sets the number of workers that should be used to assist a parallel
scan of this table. If not set, the system will determine a value based
- on the relation size. The actual number of workers chosen by the planner
- or by utility statements that use parallel scans may be less, for example
- due to the setting of <xref linkend="guc-max-worker-processes"/>.
+ on the relation size. When set on a partitioned table, the specified
+ number of workers will work on distinct partitions, so the number of
+ partitions affected by the parallel operation should be taken into
+ account. The actual number of workers chosen by the planner or by
+ utility statements that use parallel scans may be less, for example due
+ to the setting of <xref linkend="guc-max-worker-processes"/>.
</para>
</listitem>
</varlistentry>
The reader is left to believe that the default number of workers depends on the
size of the partitioned table, which is not entirely true.
Perhaps:
If not set, the system will determine a value based on the relation size and
the number of scanned partitions.
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -1268,6 +1268,59 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
add_paths_to_append_rel(root, rel, live_childrels);
}
+/*
+ * compute_append_parallel_workers
+ * Computes the number of workers to assign to scan the subpaths appended
+ * by a given Append path
+ */
+static int
+compute_append_parallel_workers(RelOptInfo *rel, List *subpaths,
+ int num_live_children,
+ bool parallel_append)
The new function should have a prototype.
+{
+ ListCell *lc;
+ int parallel_workers = 0;
+
+ /*
+ * For partitioned rels, first see if there is a root-level setting for
+ * parallel_workers. But only consider if a Parallel Append plan is
+ * to be considered.
+ */
+ if (IS_PARTITIONED_REL(rel) && parallel_append)
+ parallel_workers =
+ compute_parallel_worker(rel, -1, -1,
+ max_parallel_workers_per_gather);
+
+ /* Find the highest number of workers requested for any subpath. */
+ foreach(lc, subpaths)
+ {
+ Path *path = lfirst(lc);
+
+ parallel_workers = Max(parallel_workers, path->parallel_workers);
+ }
+ Assert(parallel_workers > 0 || subpaths == NIL);
+
+ /*
+ * If the use of parallel append is permitted, always request at least
+ * log2(# of children) workers. We assume it can be useful to have
+ * extra workers in this case because they will be spread out across
+ * the children. The precise formula is just a guess, but we don't
+ * want to end up with a radically different answer for a table with N
+ * partitions vs. an unpartitioned table with the same data, so the
+ * use of some kind of log-scaling here seems to make some sense.
+ */
+ if (parallel_append)
+ {
+ parallel_workers = Max(parallel_workers,
+ fls(num_live_children));
+ parallel_workers = Min(parallel_workers,
+ max_parallel_workers_per_gather);
+ }
+ Assert(parallel_workers > 0);
+
+ return parallel_workers;
+}
That means that it is not possible to *lower* the number of parallel workers
with this reloption, which seems to me a valid use case.
I think that if the option is set, it should override the number of workers
inherited from the partitions, and it should override the log2 default.
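To make that concrete, here is a minimal standalone sketch of the override
behavior. It is a model under stated assumptions, not a patch against
allpaths.c: the names fls_model and append_workers_model are hypothetical, the
reloption is passed in as a plain int with -1 meaning "not set", and the
subpaths are reduced to an array of their requested worker counts. How a
setting of 0 should be treated is left open here.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX(a, b) ((a) > (b) ? (a) : (b))
#define MODEL_MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Stand-in for fls(): 1-based position of the most significant set bit,
 * or 0 if no bit is set.
 */
static int
fls_model(unsigned int x)
{
    int pos = 0;

    while (x != 0)
    {
        pos++;
        x >>= 1;
    }
    return pos;
}

/*
 * Hypothetical model of the proposed behavior.
 *
 * rel_parallel_workers: reloption on the partitioned table, -1 if not set
 *                       (assumption: same convention as for plain tables).
 * subpath_workers:      workers requested by each subpath.
 */
static int
append_workers_model(int rel_parallel_workers,
                     const int *subpath_workers, int nsubpaths,
                     int num_live_children, bool parallel_append,
                     int max_workers_per_gather)
{
    int workers = 0;
    int i;

    assert(nsubpaths >= 0);

    /*
     * An explicit setting overrides both the per-partition requests and
     * the log2-style default, so it can lower the count as well as raise it.
     */
    if (parallel_append && rel_parallel_workers >= 0)
        return MODEL_MIN(rel_parallel_workers, max_workers_per_gather);

    /* Otherwise: highest number of workers requested by any subpath. */
    for (i = 0; i < nsubpaths; i++)
        workers = MODEL_MAX(workers, subpath_workers[i]);

    /* For a parallel append, apply the fls() floor and the global cap. */
    if (parallel_append)
    {
        workers = MODEL_MAX(workers,
                            fls_model((unsigned int) num_live_children));
        workers = MODEL_MIN(workers, max_workers_per_gather);
    }

    return workers;
}
```

With subpaths requesting 2, 3 and 1 workers, 8 live children and a cap of 8,
the model returns 4 when the option is not set (fls(8) is 4) and 1 when the
option is set to 1, which is the "lowering" case the patch currently cannot
express.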
Yours,
Laurenz Albe