Re: Increasing parallel workers at runtime

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Increasing parallel workers at runtime
Date: 2017-05-23 11:11:39
Message-ID: CAA4eK1Jkdc9rfMYT_cZka5-=cqQQRrsZQ__Ufu40hyzt=-EChQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 22, 2017 at 2:54 PM, Rafia Sabih
<rafia(dot)sabih(at)enterprisedb(dot)com> wrote:
> On Wed, May 17, 2017 at 2:57 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On Tue, May 16, 2017 at 2:14 PM, Ashutosh Bapat
>> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>>> On Mon, May 15, 2017 at 9:23 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>>
>>> Also, looking at the patch, it doesn't look like it takes enough
>>> care to build the execution state of a new worker so that it can
>>> participate in a running query. I may be wrong, but the execution
>>> state initialization routines are written with the assumption that
>>> all the workers start simultaneously?
>>>
>>
>> No such assumptions, workers started later can also join the execution
>> of the query.
>>
> If we are talking of run-time allocation of workers I'd like to
> propose an idea to safeguard parallelism from selectivity-estimation
> errors. Start each query (if it qualifies for the use of parallelism)
> with a minimum number of workers (say 2) irrespective of the #planned
> workers. Then as query proceeds and we find that there is more work to
> do, we allocate more workers.
>
> Let's get into the details a little. We'll have the following new
> variables:
> - T_int - a time interval at which we'll periodically check if the
> query requires more workers,
> - work_remaining - a variable which estimates the work yet to be
> done. This will use the selectivity estimates to find the total work
> done and the remaining work accordingly. Once the actual number of
> rows crosses the estimated number of rows, take the maximum possible
> number of tuples for that operator as the new estimate.
>
> Now, at gather, we'll check after each T_int whether work remains
> and, if so, allocate another 2 (say) workers. This way we'll keep
> adding workers in small chunks rather than all in one go, thus
> saving resources in case of over-estimation.
>
> Some of the things we may improve upon are:
> - Check whether we want to increase workers or kill some of them,
> e.g. if the filtering is not happening as estimated at the node, i.e.
> #output tuples is the same as #input tuples, then do not add any more
> workers, as that will only increase the work at gather.
> - Instead of just having 2 or 4 workers at the start, allocate x% of
> planned workers.
> - As the query progresses, we may alter the value of T_int and/or
> #workers to allocate, e.g. while the query is less than 50% done,
> check at every T_int; after that, increase T_int to T_int(1 + .5),
> and similarly for #workers, because by then allocating more workers
> might not accomplish much and the cost of adding new workers could
> outweigh the benefit.
>
> This scheme is likely to safeguard parallelism against selectivity
> estimation errors in the sense that it uses resources only when
> required.

Isn't this point contradictory? Basically, on one side you are
suggesting to calculate additional workers (work_remaining) based on
selectivity, and on the other side you are saying that it will fix
estimation errors. IIUC, you are talking about some sort of executor
feedback to adjust the number of workers, which might be a good thing
to do, but I think that is a different problem to solve. As of now, I
don't think we have that type of mechanism even for non-parallel
execution.
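
To make the quoted scheme concrete, here is a rough sketch of how I read
it, written as a small Python simulation rather than executor code. All
names here (GatherState, work_remaining, on_interval, max_possible_rows)
are hypothetical and not from the patch, and capping the worker count at
the planned number is my assumption; the proposal does not say so.

```python
from dataclasses import dataclass


@dataclass
class GatherState:
    """Hypothetical bookkeeping for one Gather node (illustrative only)."""
    planned_workers: int     # workers the planner asked for
    estimated_rows: int      # planner's row estimate for the node
    max_possible_rows: int   # upper bound on rows the operator can emit
    workers: int = 2         # start small, irrespective of planned_workers
    rows_seen: int = 0       # actual rows produced so far


def work_remaining(state: GatherState) -> int:
    """Estimate the rows still to come, following the quoted proposal."""
    estimate = state.estimated_rows
    if state.rows_seen > estimate:
        # Actual rows crossed the estimate: switch to the maximum
        # possible row count as the new estimate.
        estimate = state.max_possible_rows
    return max(estimate - state.rows_seen, 0)


def on_interval(state: GatherState, chunk: int = 2) -> int:
    """Run once per T_int: add a small chunk of workers if work remains.

    Assumption: never exceed planned_workers.
    """
    if work_remaining(state) > 0 and state.workers < state.planned_workers:
        state.workers = min(state.workers + chunk, state.planned_workers)
    return state.workers


state = GatherState(planned_workers=8, estimated_rows=1000,
                    max_possible_rows=100000)
state.rows_seen = 500
on_interval(state)    # work remains, so workers grows from 2 to 4
state.rows_seen = 1500
on_interval(state)    # estimate crossed, new estimate is 100000; workers is 6
```

Note that work_remaining here is driven entirely by estimated_rows until
the estimate has already been crossed, which is the dependency on
selectivity that I am questioning above.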

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
