Re: Increasing parallel workers at runtime

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Increasing parallel workers at runtime
Date: 2017-05-16 12:18:13
Message-ID: CAJrrPGeaTu_WmRdQ3_fXCaWrbuRC5yYVBEoVEUCRWPfPCyd-mw@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 16, 2017 at 1:53 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Mon, May 15, 2017 at 10:06 AM, Haribabu Kommi
> <kommi(dot)haribabu(at)gmail(dot)com> wrote:
> > This still needs some adjustments to handle the cases where
> > the main backend also does the scan instead of waiting for
> > the workers to finish the job, since the worker-increasing
> > logic shouldn't add overhead in that case.
>
> I think it would be pretty crazy to try relaunching workers after
> every tuple, as this patch does. The overhead of that will be very
> high for queries where the number of tuples passing through the Gather
> is large, whereas when the number of tuples passing through Gather is
> small, or where tuples are sent all at once at the end of processing,
> it will not actually be very effective at getting hold of more
> workers.

In the current state of the patch, the main backend tries to start the
extra workers only when no tuples are currently available from the
workers that were already launched. So I feel the attempt to get more
workers is not made for every tuple.
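To make the timing concrete, here is a minimal, self-contained sketch
of the control flow I have in mind (every name in it is invented for
illustration; this is not the executor code):

#include <stdbool.h>
#include <stdio.h>

typedef struct LeaderState
{
    int     planned_workers;    /* what the planner asked for */
    int     launched_workers;   /* what the query actually got */
} LeaderState;

/* Stand-in for polling the workers' tuple queues. */
static bool
tuple_available(const LeaderState *s)
{
    (void) s;
    return false;               /* pretend the queues are empty */
}

/* Stand-in for asking the postmaster for one more worker slot. */
static bool
try_launch_one_worker(LeaderState *s)
{
    s->launched_workers++;      /* pretend a slot had freed up */
    return true;
}

static void
leader_wait_for_tuple(LeaderState *s)
{
    /*
     * Only when no worker has a tuple ready do we try to grow the
     * pool, so a query that streams tuples steadily never pays for
     * relaunch attempts.
     */
    while (!tuple_available(s) &&
           s->launched_workers < s->planned_workers)
    {
        if (!try_launch_one_worker(s))
            break;              /* no free slot; run the plan locally */
    }
}

int
main(void)
{
    LeaderState s = {.planned_workers = 4, .launched_workers = 1};

    leader_wait_for_tuple(&s);
    printf("workers: %d of %d\n", s.launched_workers, s.planned_workers);
    return 0;
}

Given that placement, I see the following scenarios: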

1. When a large number of tuples is getting transferred from the
workers, I feel there is very little chance that the backend is free
to start more workers, because it rarely runs out of tuples and may
not even need to execute the plan locally.

2. When the tuples are transferred only at the end of the plan, for
example when the plan involves a sort node or an aggregate, the
backend either waits for the tuples to arrive or executes the plan
itself along with the workers, after trying to extend the number of
workers just once.

3. When only a small number of tuples is getting transferred, there is
a chance of the extra-worker invocation happening more often than in
the other scenarios. But the low transfer rate is probably caused by a
complex filter condition that takes time to evaluate and filters out
most of the records. In that case, once the backend has tried to
extend the number of workers, it also participates in executing the
plan, and producing a tuple locally takes it some time; by then there
is a good chance that the workers already have tuples ready.

The problem of repeatedly asking for more workers can arise when only
one worker is allotted to the query execution.

Am I missing something?

> A different idea is to have an area in shared memory where
> queries can advertise that they didn't get all of the workers they
> wanted, plus a background process that periodically tries to launch
> workers to help those queries as parallel workers become available.
> It can recheck for available workers after some interval, say 10s.
> There are some problems there -- the process won't have bgw_notify_pid
> pointing at the parallel leader -- but I think it might be best to try
> to solve those problems instead of making it the leader's job to try
> to grab more workers as we go along. For one thing, the background
> process idea can attempt to achieve fairness. Suppose there are two
> processes that didn't get all of their workers; one got 3 of 4, the
> other 1 of 4. When a worker becomes available, we'd presumably like
> to give it to the process that got 1 of 4, rather than having the
> leaders race to see who grabs the new worker first. Similarly if
> there are four workers available and two queries that each got 1 of 5
> workers they wanted, we'd like to split the workers two and two,
> rather than having one leader grab all four of them. Or at least, I
> think that's what we want.

A background process along those lines can indeed produce a fair
distribution of workers among the parallel queries. In this case also,
the backend should advertise only when the allotted workers are not
enough: the planned number of workers may be 5, but because of some
other part of the query the main backend may be fed tuples by just 2
workers, in which case there is no need to provide extra workers. A
rough illustration of such an advertisement is below.
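As a sketch only (the struct and function names are my invention, not
a proposed layout), each leader could publish its shortfall, capped by
the number of workers that can usefully feed it, and the background
process could hand each freed slot to the query with the largest
relative deficit:

#include <stddef.h>

/* One slot per backend in a small shared-memory array. */
typedef struct WorkerShortfallSlot
{
    int     leader_pid;         /* 0 means the slot is unused */
    int     workers_planned;    /* what the planner wanted */
    int     workers_launched;   /* what the query actually got */
    int     workers_useful;     /* cap: more than this is wasted */
} WorkerShortfallSlot;

/*
 * Background process side: pick the query with the largest relative
 * deficit, so 1-of-4 is served before 3-of-4, and two queries that
 * each got 1 of 5 split four free slots two and two.
 */
static WorkerShortfallSlot *
pick_neediest(WorkerShortfallSlot *slots, int nslots)
{
    WorkerShortfallSlot *best = NULL;
    double      best_ratio = 1.0;

    for (int i = 0; i < nslots; i++)
    {
        WorkerShortfallSlot *s = &slots[i];
        int         wanted = s->workers_planned < s->workers_useful ?
                             s->workers_planned : s->workers_useful;
        double      ratio;

        if (s->leader_pid == 0 || wanted <= 0 ||
            s->workers_launched >= wanted)
            continue;

        ratio = (double) s->workers_launched / wanted;
        if (best == NULL || ratio < best_ratio)
        {
            best = s;
            best_ratio = ratio;
        }
    }
    return best;
}

Whether the launched/wanted ratio is the right fairness metric is
debatable, but something of this shape gives the behaviour you
describe without the leaders racing for the free slots.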

The wait-interval part of the background process approach, where it
rechecks for available workers after some period (say 10s), doesn't
help the queries that finish before the configured wait time elapses.
Maybe we can ignore those scenarios?

It also needs some smarter logic to share the details required to
start a worker, since today the worker is registered by the main
backend itself. But I feel this approach is useful for the cases where
the query doesn't get any workers at all.
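For reference, this is roughly what the leader fills in today when it
registers a parallel worker (paraphrased from parallel.c in the v10
sources; seg and leader_pid here stand in for the real variables). All
of this, including the DSM handle, would have to be conveyed to the
background process, and bgw_notify_pid is the awkward part you
mention:

BackgroundWorker worker;
BackgroundWorkerHandle *handle;

memset(&worker, 0, sizeof(worker));
snprintf(worker.bgw_name, BGW_MAXLEN, "parallel worker");
worker.bgw_flags = BGWORKER_SHMEM_ACCESS |
    BGWORKER_BACKEND_DATABASE_CONNECTION;
worker.bgw_start_time = BgWorkerStart_ConsistentState;
worker.bgw_restart_time = BGW_NEVER_RESTART;
snprintf(worker.bgw_library_name, BGW_MAXLEN, "postgres");
snprintf(worker.bgw_function_name, BGW_MAXLEN, "ParallelWorkerMain");
/* The DSM segment the new worker must attach to. */
worker.bgw_main_arg = UInt32GetDatum(dsm_segment_handle(seg));
/* Today this is MyProcPid; a separate launcher cannot say that. */
worker.bgw_notify_pid = leader_pid;

RegisterDynamicBackgroundWorker(&worker, &handle);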

Regards,
Hari Babu
Fujitsu Australia
