Re: Parallel Append implementation

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Append implementation
Date: 2017-02-14 17:05:36
Message-ID: CA+TgmobRmUWfkBM7B+qjpxr4f-SVFYexpfjgSbwJAbQhaAdW5A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 6, 2017 at 12:36 AM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
>> Now that I think of that, I think for implementing above, we need to
>> keep track of per-subplan max_workers in the Append path; and with
>> that, the bitmap will be redundant. Instead, it can be replaced with
>> max_workers. Let me check if it is easy to do that. We don't want to
>> have the bitmap if we are sure it would be replaced by some other data
>> structure.
>
> Attached is v2 patch, which implements above. Now Append plan node
> stores a list of per-subplan max worker count, rather than the
> Bitmapset. But still Bitmapset turned out to be necessary for
> AppendPath. More details are in the subsequent comments.

Keep in mind that, for a non-partial path, the cap of 1 worker for
that subplan is a hard limit. Anything more will break the world.
But for a partial plan, the limit -- whether 1 or otherwise -- is a
soft limit. It may not help much to route more workers to that node,
and conceivably it could even hurt, but it shouldn't yield any
incorrect result. I'm not sure it's a good idea to conflate those two
things. For example, suppose that I have a scan of two children, one
of which has parallel_workers of 4, and the other of which has
parallel_workers of 3. If I pick parallel_workers of 7 for the
Parallel Append, that's probably too high. Had those two tables been
a single unpartitioned table, I would have picked 4 or 5 workers, not
7. On the other hand, if I pick parallel_workers of 4 or 5 for the
Parallel Append, and I finish with the larger table first, I think I
might as well throw all 4 of those workers at the smaller table even
though it would normally have only used 3 workers. Having the extra
1-2 workers exist does not seem better.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-02-14 17:05:58 Re: Parallel Append implementation
Previous Message David Fetter 2017-02-14 16:54:01 Re: UPDATE of partition key