Re: Parallel Append implementation

From: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Append implementation
Date: 2017-03-10 13:12:17
Message-ID: CAJ3gD9f5kTa_pQUCisunMbQi8MZbn_oPv+G-_te57+e3o+8UDw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

After giving more thought to our discussions, I have have used the
Bitmapset structure in AppendPath as against having two lists one for
partial and other for non-partial paths. Attached is the patch v6 that
has the required changes. So accumulate_append_subpath() now also
prepares the bitmapset containing the information about which paths
are partial paths. This is what I had done in the first version.

At this point of time, I have not given sufficient time to think about
Ashutosh's proposal of just keeping track of the next_subplan which he
mentioned. There, we just keep assigning workers to a circle of
subplans in round-robin style. But I think as of now the approach of
choosing the minimum worker subplan is pretty simple looking. So the
patch v6 is in a working condition using minimum-worker approach.

On 9 March 2017 at 07:22, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Some review:
>
> +typedef struct ParallelAppendDescData
> +{
> + slock_t pa_mutex; /* mutual exclusion to choose
> next subplan */
> + ParallelAppendInfo pa_info[FLEXIBLE_ARRAY_MEMBER];
> +} ParallelAppendDescData;
>
> Instead of having ParallelAppendInfo, how about just int
> pa_workers[FLEXIBLE_ARRAY_MEMBER]? The second structure seems like
> overkill, at least for now.

I have , for now, kept the structure there, just in case after further
discussion we may add something.

>
> +static inline void
> +exec_append_scan_first(AppendState *appendstate)
> +{
> + appendstate->as_whichplan = 0;
> +}
>
> I don't think this is buying you anything, and suggest backing it out.

This is required for sequential Append, so that we can start executing
from the first subplan.

>
> + /* Backward scan is not supported by parallel-aware plans */
> + Assert(!ScanDirectionIsBackward(appendstate->ps.state->es_direction));
>
> I think you could assert ScanDirectionIsForward, couldn't you?
> NoMovement, I assume, is right out.

Right. Changed.

>
> + elog(DEBUG2, "ParallelAppend : pid %d : all plans already
> finished",
> + MyProcPid);
>
> Please remove (and all similar cases also).

Removed at multiple places.

>
> + sizeof(*node->as_padesc->pa_info) * node->as_nplans);
>
> I'd use the type name instead.

Done.

>
> + for (i = 0; i < node->as_nplans; i++)
> + {
> + /*
> + * Just setting all the number of workers to 0 is enough. The logic
> + * of choosing the next plan in workers will take care of everything
> + * else.
> + */
> + padesc->pa_info[i].pa_num_workers = 0;
> + }
>
> Here I'd use memset.

Done.

>
> + return (min_whichplan == PA_INVALID_PLAN ? false : true);
>
> Maybe just return (min_whichplan != PA_INVALID_PLAN);

Done.

>
> - childrel->cheapest_total_path);
> +
> childrel->cheapest_total_path);
>
> Unnecessary.

This call is now having more param, so kept the change.
>
> + {
> partial_subpaths = accumulate_append_subpath(partial_subpaths,
> linitial(childrel->partial_pathlist));
> + }
>
> Don't need to add braces.

Removed them.

>
> + /*
> + * Extract the first unparameterized, parallel-safe one among the
> + * child paths.
> + */
>
> Can we use get_cheapest_parallel_safe_total_inner for this, from
> a71f10189dc10a2fe422158a2c9409e0f77c6b9e?

Yes, Fixed.

>
> + if (rel->partial_pathlist != NIL &&
> + (Path *) linitial(rel->partial_pathlist) == subpath)
> + partial_subplans_set = bms_add_member(partial_subplans_set, i);
>
> This seems like a scary way to figure this out. What if we wanted to
> build a parallel append subpath with some path other than the
> cheapest, for some reason? I think you ought to record the decision
> that set_append_rel_pathlist makes about whether to use a partial path
> or a parallel-safe path, and then just copy it over here.

As mentioned above, used Bitmapset in AppendPath.

>
> - create_append_path(grouped_rel,
> - paths,
> - NULL,
> - 0);
> + create_append_path(grouped_rel, paths, NULL, 0);
>
> Unnecessary.

Now since there was anyway a change in the number of params, I kept
the single line call.

Please refer to attached patch version v6 for all of the above changes.

Attachment Content-Type Size
ParallelAppend_v6.patch application/octet-stream 40.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2017-03-10 13:12:51 Re: PATCH: Configurable file mode mask
Previous Message Alexander Korotkov 2017-03-10 13:08:18 Re: Should we cacheline align PGXACT?