Re: Increasing parallel workers at runtime

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Increasing parallel workers at runtime
Date: 2017-05-15 15:53:23
Message-ID: CA+TgmoaBOaFJCbQWHEeU2aXb=PJK02u+EETByPY8m49hGvKHzw@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 15, 2017 at 10:06 AM, Haribabu Kommi
<kommi(dot)haribabu(at)gmail(dot)com> wrote:
> This still needs some adjustments to fix for the cases where
> the main backend also does the scan instead of waiting for
> the workers to finish the job. As increasing the workers logic
> shouldn't add an overhead in this case.

I think it would be pretty crazy to try relaunching workers after
every tuple, as this patch does. The overhead of that will be very
high for queries where the number of tuples passing through the Gather
is large, whereas when the number of tuples passing through Gather is
small, or where tuples are sent all at once at the end of processing,
it will not actually be very effective at getting hold of more
workers. A different idea is to have an area in shared memory where
queries can advertise that they didn't get all of the workers they
wanted, plus a background process that periodically tries to launch
workers to help those queries as parallel workers become available.
It can recheck for available workers after some interval, say 10s.
There are some problems there -- the process won't have bgw_notify_pid
pointing at the parallel leader -- but I think it might be best to try
to solve those problems instead of making it the leader's job to try
to grab more workers as we go along. For one thing, the background
process idea can attempt to achieve fairness. Suppose there are two
processes that didn't get all of their workers; one got 3 of 4, the
other 1 of 4. When a worker becomes available, we'd presumably like
to give it to the process that got 1 of 4, rather than having the
leaders race to see who grabs the new worker first. Similarly if
there are four workers available and two queries that each got 1 of 5
workers they wanted, we'd like to split the workers two and two,
rather than having one leader grab all four of them. Or at least, I
think that's what we want.
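
As a rough illustration of the fairness idea above, here is a minimal
Python sketch of how a central process might hand out newly available
workers. It is not PostgreSQL code: the Request class, the distribute
function, and the "lowest granted/wanted ratio first" rule are all
assumptions made for the example, and other fairness rules are equally
plausible.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of a query that did not get all its workers."""
    name: str
    wanted: int   # workers the query asked for
    granted: int  # workers it actually got at launch


def distribute(requests, available):
    """Hand out `available` workers one at a time.

    Each worker goes to the query with the lowest granted/wanted ratio
    that is still short of workers. Ties go to the earlier request.
    Returns a dict mapping query name to the number of extra workers.
    """
    have = {r.name: r.granted for r in requests}
    extra = {r.name: 0 for r in requests}
    for _ in range(available):
        # Only queries that are still short of workers are candidates.
        short = [r for r in requests if have[r.name] < r.wanted]
        if not short:
            break  # everyone is satisfied; leave the rest unassigned
        neediest = min(short, key=lambda r: have[r.name] / r.wanted)
        have[neediest.name] += 1
        extra[neediest.name] += 1
    return extra


# One query got 3 of 4, the other 1 of 4; one worker frees up.
print(distribute([Request("q1", 4, 3), Request("q2", 4, 1)], 1))

# Two queries each got 1 of 5; four workers free up.
print(distribute([Request("q1", 5, 1), Request("q2", 5, 1)], 4))
```

Under this rule the single worker in the first case goes to the query
that got 1 of 4, and the four workers in the second case are split two
and two, which matches the outcomes described above.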

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
