ExecGather() + nworkers

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: ExecGather() + nworkers
Date: 2016-01-10 05:29:39
Message-ID: CAM3SWZQbbJ37xEjNB7kXs3e6OZnOw4HXC05PHOiRdJ74=iZwdw@mail.gmail.com
Lists: pgsql-hackers

The Gather node executor function ExecGather() does this:

    /*
     * Register backend workers. We might not get as many as we
     * requested, or indeed any at all.
     */
    pcxt = node->pei->pcxt;
    LaunchParallelWorkers(pcxt);

    /* Set up tuple queue readers to read the results. */
    if (pcxt->nworkers > 0)
    {
        ...
    }

    /* No workers? Then never mind. */
    if (!got_any_worker)
        ExecShutdownGatherWorkers(node);

I'm not sure why the test for nworkers following the
LaunchParallelWorkers() call doesn't look like this, though:

    /* Set up tuple queue readers to read the results. */
    if (pcxt->nworkers_launched > 0)
    {
        ...
    }

ISTM, having followed the chain of calls, that the "if" statement I
highlight here is currently tautological, excluding one or two special
cases in the CreateParallelContext() call performed by
ExecInitParallelPlan() (e.g., the IsolationIsSerializable() case *can*
result in the caller's nworkers being overridden to 0).
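
For reference, the overrides I mean in CreateParallelContext() look
roughly like this (paraphrased from memory, so the exact form may
differ slightly):

    /* In CreateParallelContext(), roughly (paraphrased, not verbatim) */
    if (dynamic_shared_memory_type == DSM_IMPL_NONE)
        nworkers = 0;           /* no DSM means no workers */

    if (IsolationIsSerializable())
        nworkers = 0;           /* SSI isn't parallel-aware (yet) */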

I guess it isn't surprising that the code evolved to look like this,
since the commit message of b0b0d84b ponders: "I suspect we'll want to
revise the Gather node to make use of this new capability [relaunching
workers], but even if not it may be useful elsewhere and requires very
little additional code". The nworkers_launched tracking came only in
that later commit (b0b0d84b).

From my point of view, as a student of this code working on parallel
index builds (i.e., new code that will also be a client of parallel.c,
a module which right now only has nodeGather.c as a client to learn
from), this is confusing. It's confusing just because the simpler
approach isn't taken when it could have been. It isn't actually wrong
that we figure out if any workers were launched in this relatively
laborious way:

    /* Set up tuple queue readers to read the results. */
    if (pcxt->nworkers > 0)
    {
        ...

        for (i = 0; i < pcxt->nworkers; ++i)
        {
            if (pcxt->worker[i].bgwhandle == NULL)
                continue;

            ...

            got_any_worker = true;
            ...

But going to this additional trouble (detecting that no workers were
launched on the basis of !got_any_worker) suggests that simply testing
pcxt->nworkers_launched would be wrong, which AFAICT it isn't. Can't we
just do that, and in so doing also remove the "for" loop's detection
work entirely?
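
That is, the tail end could presumably look more like this (only a
sketch, with the remaining reader setup elided):

    /* Set up tuple queue readers to read the results. */
    if (pcxt->nworkers_launched > 0)
    {
        ...
    }

    /* No workers? Then never mind. */
    if (pcxt->nworkers_launched == 0)
        ExecShutdownGatherWorkers(node);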

In the case of parallel sequential scan, it looks like even one worker
can be helpful, because the Gather node (leader process) can also run
the plan itself to some degree, so there are effectively 2 processes
scanning at a minimum (unless no workers could be launched to begin
with). How useful is a parallel scan when that happens, though?

I guess it isn't obvious to me how to reliably back out of not being
able to launch at least 2 workers in the case of my parallel index
build patch, because I suspect 2 workers (plus the leader process) are
the minimum number that will make index builds faster. Right now, it
looks like I'll have to check nworkers_launched in the leader (which
will be the only process to access the ParallelContext, since it's in
its local memory). Then, having established that at least the minimum
useful number of worker processes was launched for sorting, the leader
can "permit" the worker processes to "really" start by changing some
state in the TOC/segment in common use. Otherwise, the leader must call
the whole thing off and do a conventional, serial index build, even
though, technically, the worker processes have already begun executing
their main function.
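
Roughly what I have in mind there (only a sketch, with made-up names
like BuildShared and launchOkay, and a crude sleep loop where a latch
would really be used):

    /* hypothetical state placed in the shared TOC segment */
    typedef struct BuildShared
    {
        slock_t     mutex;
        bool        decided;        /* has the leader made up its mind? */
        bool        launchOkay;     /* true => workers should really start */
        /* ... per-build state for sorting, etc. ... */
    } BuildShared;

    /* leader, just after LaunchParallelWorkers(pcxt) returns */
    SpinLockAcquire(&shared->mutex);
    shared->decided = true;
    shared->launchOkay = (pcxt->nworkers_launched >= 2);
    SpinLockRelease(&shared->mutex);

    /* each worker, at the top of its main function */
    for (;;)
    {
        bool        decided;
        bool        launchOkay;

        SpinLockAcquire(&shared->mutex);
        decided = shared->decided;
        launchOkay = shared->launchOkay;
        SpinLockRelease(&shared->mutex);

        if (decided)
        {
            if (!launchOkay)
                return;             /* leader called it off; exit quietly */
            break;                  /* really start */
        }

        CHECK_FOR_INTERRUPTS();
        pg_usleep(1000L);           /* crude; a latch would be better */
    }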

I think what might be better is a general solution to my problem, which
I imagine will crop up again and again as new clients are added. I
would like an API that lets callers of LaunchParallelWorkers() actually
launch *any* workers only when some minimum sensible number (typically
2) can be launched. Otherwise, it would indicate failure, allowing
callers to call the whole thing off in a general way, without the
postmaster having actually launched anything, and without custom "call
it all off" code for parallel index builds. This would probably involve
introducing a distinction between a BackgroundWorkerSlot being
"reserved" and being "in_use", lest the postmaster accidentally launch
1 worker process before we've established definitively that launching
any is really a good idea.
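
In API terms, maybe something along these lines (the function name and
signature are invented purely for illustration):

    /*
     * Hypothetical variant of LaunchParallelWorkers(): reserve up to
     * pcxt->nworkers slots, but only launch workers if at least min_workers
     * slots could be reserved; otherwise launch nothing and return 0.  The
     * return value is therefore either 0 or >= min_workers.
     */
    extern int  LaunchParallelWorkersMin(ParallelContext *pcxt,
                                         int min_workers);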

Does that make sense?

Thanks
--
Peter Geoghegan
