Re: Parallel Append implementation

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Append implementation
Date: 2017-03-10 06:03:00
Message-ID: CAFjFpRcEs43bQYSf1cbA2S0c6UsmxbDNPDemg2-5e39torxhJA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> But as far as code is concerned, I think the two-list approach will
> turn out to be less simple if we derive corresponding two different
> arrays in AppendState node. Handling two different arrays during
> execution does not look clean. Whereas, the bitmapset that I have used
> in Append has turned out to be very simple. I just had to do the below
> check (and that is the only location) to see if it's a partial or
> non-partial subplan. There is nowhere else any special handling for
> non-partial subpath.
>
> /*
> * Increment worker count for the chosen node, if at all we found one.
> * For non-partial plans, set it to -1 instead, so that no other workers
> * run it.
> */
> if (min_whichplan != PA_INVALID_PLAN)
> {
> if (bms_is_member(min_whichplan,
> ((Append*)state->ps.plan)->partial_subplans_set))
> padesc->pa_info[min_whichplan].pa_num_workers++;
> else
> padesc->pa_info[min_whichplan].pa_num_workers = -1;
> }
>
> Now, since Bitmapset field is used during execution with such
> simplicity, why not have this same data structure in AppendPath, and
> re-use bitmapset field in Append plan node without making a copy of
> it. Otherwise, if we have two lists in AppendPath, and a bitmap in
> Append, again there is going to be code for data structure conversion.
>

I think there is some merit in separating out non-parallel and
parallel plans within the same array or outside it. The current logic
to assign plan to a worker looks at all the plans, unnecessarily
hopping over the un-parallel ones after they are given to a worker. If
we separate those two, we can keep assigning new workers to the
non-parallel plans first and then iterate over the parallel ones when
a worker needs a plan to execute. We might eliminate the need for
special value -1 for num workers. You may separate those two kinds in
two different arrays or within the same array and remember the
smallest index of a parallel plan.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tsunakawa, Takayuki 2017-03-10 06:05:25 Re: PATCH: Configurable file mode mask
Previous Message Beena Emerson 2017-03-10 05:53:57 Re: increasing the default WAL segment size