Re: Parallel Append implementation

From: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
To: Tels <nospam-pg-abuse(at)bloodgate(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Append implementation
Date: 2017-03-13 09:04:09
Message-ID: CAJ3gD9eRn6MJkJYUmAbvm62b1fVyUARhv9EiXiAjVRqBvHuKFw@mail.gmail.com
Lists: pgsql-hackers

On 12 March 2017 at 19:31, Tels <nospam-pg-abuse(at)bloodgate(dot)com> wrote:
> Moin,
>
> On Sat, March 11, 2017 11:29 pm, Robert Haas wrote:
>> On Fri, Mar 10, 2017 at 6:01 AM, Tels <nospam-pg-abuse(at)bloodgate(dot)com>
>> wrote:
>>> Just a question for me to understand the implementation details vs. the
>>> strategy:
>>>
>>> Have you considered how the scheduling decision might impact performance
>>> due to "inter-plan parallelism vs. in-plan parallelism"?
>>>
>>> So what would be the scheduling strategy? And should there be a fixed
>>> one
>>> or user-influencable? And what could be good ones?
>>>
>>> A simple example:
>>>
>>> E.g. if we have 5 subplans, and each can have at most 5 workers and we
>>> have 5 workers overall.
>>>
>>> So, do we:
>>>
>>> Assign 5 workers to plan 1. Let it finish.
>>> Then assign 5 workers to plan 2. Let it finish.
>>> and so on
>>>
>>> or:
>>>
>>> Assign 1 worker to each plan until no workers are left?
>>
>> Currently, we do the first of those, but I'm pretty sure the second is
>> way better. For example, suppose each subplan has a startup cost. If
>> you have all the workers pile on each plan in turn, every worker pays
>> the startup cost for every subplan. If you spread them out, then
>> subplans can get finished without being visited by all workers, and
>> then the other workers never pay those costs. Moreover, you reduce
>> contention for spinlocks, condition variables, etc. It's not
>> impossible to imagine a scenario where having all workers pile on one
>> subplan at a time works out better: for example, suppose you have a
>> table with lots of partitions all of which are on the same disk, and
>> it's actually one physical spinning disk, not an SSD or a disk array
>> or anything, and the query is completely I/O-bound. Well, it could
>> be, in that scenario, that spreading out the workers is going to turn
>> sequential I/O into random I/O and that might be terrible. In most
>> cases, though, I think you're going to be better off. If the
>> partitions are on different spindles or if there's some slack I/O
>> capacity for prefetching, you're going to come out ahead, maybe way
>> ahead. If you come out behind, then you're evidently totally I/O
>> bound and have no capacity for I/O parallelism; in that scenario, you
>> should probably just turn parallel query off altogether, because
>> you're not going to benefit from it.
>
> I agree with the proposition that both strategies can work well, or not,
> depending on system-setup, the tables and data layout. I'd be a bit more
> worried about turning it into the "random-io-case", but that's still just
> a feeling and guesswork.
>
> So which one will be better seems speculative, hence the question for
> benchmarking different strategies.
>
> So, I'd like to see the scheduler live in a single place, maybe a
> function that gets called with the number of currently running workers,
> the max. number of workers to be expected, the new worker, and the list
> of plans still to do, and then schedules that single worker to one of
> these plans by strategy X.
>
> That would make it easier to swap out X for Y and see how it fares,
> wouldn't it?

Yes, actually pretty much all of the scheduler logic is already in one
single function, parallel_append_next().

>
>
> However, I don't think the patch needs to select the optimal strategy
> right from the beginning (if that even exists, maybe it's a mixed
> strategy), even "not so optimal" parallelism will be better than doing all
> things sequentially.
>
> Best regards,
>
> Tels

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company
