Re: Parallel Append implementation

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Append implementation
Date: 2017-03-09 01:52:32
Message-ID: CA+TgmoaJ3+KYFusQj6cqcsi_jSTt6V=3PKCit=c=t4nsZty0gw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 8, 2017 at 2:00 AM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> Yeah, that seems to be right in most of the cases. The only cases
> where your formula seems to give too few workers are for something
> like (2, 8, 8). For such subplans, we should allocate at least 8
> workers. It turns out that with my formula, in most cases the number
> of Append workers allocated is just one more than the max per-subplan
> worker count. So for (2, 1, 1, 8), it will be a fraction more than 8.
> So in the patch, in addition to the log2() formula you proposed, I
> have made sure that it allocates at least max(per-subplan
> parallel_workers values).

Yeah, I agree with that.
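
For the record, here's a minimal sketch of how I read the combined
heuristic (the function name, the ceil(log2()) rounding, and the final
clamp are my assumptions, not the patch's exact code):

#include <math.h>

/* Sketch: choose the Append's worker count from its children's counts. */
static int
append_parallel_workers(const int *subplan_workers, int nsubplans,
                        int max_workers_per_gather)
{
    int     workers = 0;
    int     i;

    /* Never request fewer workers than the busiest subplan wants. */
    for (i = 0; i < nsubplans; i++)
        if (subplan_workers[i] > workers)
            workers = subplan_workers[i];

    /* Also apply the log2(number of subplans) lower bound. */
    if (workers < (int) ceil(log2(nsubplans)))
        workers = (int) ceil(log2(nsubplans));

    /* And stay within the per-Gather limit. */
    if (workers > max_workers_per_gather)
        workers = max_workers_per_gather;

    return workers;
}

With that, (2, 8, 8) gets 8 workers, and (2, 1, 1, 8) also gets 8.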

Some review:

+typedef struct ParallelAppendDescData
+{
+ slock_t pa_mutex; /* mutual exclusion to choose next subplan */
+ ParallelAppendInfo pa_info[FLEXIBLE_ARRAY_MEMBER];
+} ParallelAppendDescData;

Instead of having ParallelAppendInfo, how about just int
pa_workers[FLEXIBLE_ARRAY_MEMBER]? The second structure seems like
overkill, at least for now.
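
That is, something like this (sketch):

typedef struct ParallelAppendDescData
{
    slock_t     pa_mutex;   /* mutual exclusion to choose next subplan */
    int         pa_workers[FLEXIBLE_ARRAY_MEMBER]; /* per-subplan worker counts */
} ParallelAppendDescData;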

+static inline void
+exec_append_scan_first(AppendState *appendstate)
+{
+ appendstate->as_whichplan = 0;
+}

I don't think this is buying you anything, and suggest backing it out.

+ /* Backward scan is not supported by parallel-aware plans */
+ Assert(!ScanDirectionIsBackward(appendstate->ps.state->es_direction));

I think you could assert ScanDirectionIsForward, couldn't you?
NoMovement, I assume, is right out.
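That is:

Assert(ScanDirectionIsForward(appendstate->ps.state->es_direction));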

+ elog(DEBUG2, "ParallelAppend : pid %d : all plans already finished",
+ MyProcPid);

Please remove this (and all similar cases).

+ sizeof(*node->as_padesc->pa_info) * node->as_nplans);

I'd use the type name instead.

+ for (i = 0; i < node->as_nplans; i++)
+ {
+ /*
+ * Just setting all the number of workers to 0 is enough. The logic
+ * of choosing the next plan in workers will take care of everything
+ * else.
+ */
+ padesc->pa_info[i].pa_num_workers = 0;
+ }

Here I'd use memset.
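Something like this, which also picks up the sizeof() suggestion above
(sketch):

memset(padesc->pa_info, 0,
       sizeof(ParallelAppendInfo) * node->as_nplans);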

+ return (min_whichplan == PA_INVALID_PLAN ? false : true);

Maybe just return (min_whichplan != PA_INVALID_PLAN);

- childrel->cheapest_total_path);
+ childrel->cheapest_total_path);

Unnecessary.

+ {
partial_subpaths = accumulate_append_subpath(partial_subpaths,
linitial(childrel->partial_pathlist));
+ }

Don't need to add braces.

+ /*
+ * Extract the first unparameterized, parallel-safe one among the
+ * child paths.
+ */

Can we use get_cheapest_parallel_safe_total_inner for this, from
a71f10189dc10a2fe422158a2c9409e0f77c6b9e?
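
If it does fit, the extraction here might reduce to something like this
(sketch; the list-variable name is mine, and I'm assuming the helper is
passed the child's pathlist):

Path   *nppath;

nppath = get_cheapest_parallel_safe_total_inner(childrel->pathlist);
if (nppath != NULL)
    nonpartial_subpaths = accumulate_append_subpath(nonpartial_subpaths,
                                                    nppath);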

+ if (rel->partial_pathlist != NIL &&
+ (Path *) linitial(rel->partial_pathlist) == subpath)
+ partial_subplans_set = bms_add_member(partial_subplans_set, i);

This seems like a scary way to figure this out. What if we wanted to
build a parallel append subpath with some path other than the
cheapest, for some reason? I think you ought to record the decision
that set_append_rel_pathlist makes about whether to use a partial path
or a parallel-safe path, and then just copy it over here.
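
Roughly what I have in mind, with hypothetical names (the point is to
record the choice once and consult it later, rather than re-deriving it
by pointer comparison):

/* in set_append_rel_pathlist(), while choosing each child's path */
if (use_partial_path_for_child)
    partial_subpath_set = bms_add_member(partial_subpath_set, childno);

/* later, when building the parallel Append, copy/consult that set instead
 * of comparing against linitial(rel->partial_pathlist) */
is_partial_subplan = bms_is_member(i, partial_subpath_set);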

- create_append_path(grouped_rel,
- paths,
- NULL,
- 0);
+ create_append_path(grouped_rel, paths, NULL, 0);

Unnecessary.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
