Re: ExecGather() + nworkers

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ExecGather() + nworkers
Date: 2016-01-10 17:13:38
Message-ID: CA+Tgmoajv7C8vvOajtsBuesP0ge07LFEY4vz+4ciRCd5FCdnGg@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jan 10, 2016 at 12:29 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> The Gather node executor function ExecGather() does this:
> [ code ]
> I'm not sure why the test for nworkers following the
> LaunchParallelWorkers() call doesn't look like this, though:
>
> /* Set up tuple queue readers to read the results. */
> if (pcxt->nworkers_launched > 0)
> {
> ...
> }

Hmm, yeah, I guess it could do that.

> But going to this additional trouble (detecting no workers launched on
> the basis of !nworkers_launched) suggests that simply testing
> nworkers_launched would be wrong, which AFAICT it isn't. Can't we just
> do that, and in so doing also totally remove the "for" loop shown
> here?

I don't see how the for loop goes away.
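
To illustrate, here is a toy model, not the actual ExecGather() code. The
names ToyReader, ToyGather and toy_setup_readers are made up for this sketch.
It assumes each launched worker needs its own reader, so the guard can test
nworkers_launched but the per-worker loop is still needed:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a tuple queue reader and the Gather state. */
typedef struct ToyReader
{
    int         worker_id;
} ToyReader;

typedef struct ToyGather
{
    int         nreaders;
    ToyReader  *readers;
} ToyGather;

/*
 * Create one reader per launched worker.  Returns the number of readers
 * set up, or -1 if allocation fails.  With zero workers launched, nothing
 * is allocated and the leader would have to run the plan by itself.
 */
int
toy_setup_readers(ToyGather *g, int nworkers_launched)
{
    assert(nworkers_launched >= 0);

    g->nreaders = 0;
    g->readers = NULL;

    if (nworkers_launched > 0)
    {
        g->readers = malloc((size_t) nworkers_launched * sizeof(ToyReader));
        if (g->readers == NULL)
            return -1;

        /* The loop remains: every launched worker gets its own reader. */
        for (int i = 0; i < nworkers_launched; i++)
            g->readers[g->nreaders++].worker_id = i;
    }

    return g->nreaders;
}
```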

> In the case of parallel sequential scan, it looks like one worker can
> be helpful, because then the gather node (leader process) can run the
> plan itself to some degree, and so there are effectively 2 processes
> scanning at a minimum (unless 0 workers could be allocated to begin
> with). How useful is it to have a parallel scan when this happens,
> though?

Empirically, that's really quite useful. When you have 3 or 4
workers, the leader really doesn't make a significant contribution to
the work, but what I've seen in my testing is that 1 worker often runs
almost twice as fast as 0 workers.

> I guess it isn't obvious to me how to reliably back out of not being
> able to launch at least 2 workers in the case of my parallel index
> build patch, because I suspect 2 workers (plus the leader process) are
> the minimum number that will make index builds faster. Right now, it
> looks like I'll have to check nworkers_launched in the leader (which
> will be the only process to access the ParallelContext, since it's in
> its local memory). Then, having established that there are at least
> the minimum useful number of worker processes launched for sorting,
> the leader can "permit" worker processes to "really" start based on
> changing some state in the TOC/segment in common use. Otherwise, the
> leader must call the whole thing off and do a conventional, serial
> index build, even though technically the main worker process function
> has started execution in worker processes.

I don't really understand why this should be so. I thought the idea
of parallel sort is (roughly) that each worker should read data until
it fills work_mem, sort that data, and write a tape. Repeat until no
data remains. Then, merge the tapes. I don't see any reason at all
why this shouldn't work just fine with a leader and 1 worker.
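
A rough sketch of that scheme, as a single-process toy model rather than
anything resembling the real tuplesort code. The names here (Tape,
build_runs, merge_tapes, free_tapes) are invented, a "tape" is just an
in-memory sorted run, run_cap stands in for work_mem, and runs are assigned
to participants round-robin purely for illustration. The point is that the
merge does not care how many participants produced the runs:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy "tape": one sorted run held in memory. */
typedef struct Tape
{
    int        *vals;
    size_t      len;
    size_t      pos;        /* read cursor used during the merge */
    int         owner;      /* participant that would have sorted this run */
} Tape;

static int
cmp_int(const void *a, const void *b)
{
    int         x = *(const int *) a;
    int         y = *(const int *) b;

    return (x > y) - (x < y);
}

/*
 * Cut the input into runs of at most run_cap values, sort each run, and
 * store it as a tape.  The caller must supply room for
 * (n + run_cap - 1) / run_cap tapes.  Returns the number of tapes written.
 */
size_t
build_runs(const int *in, size_t n, size_t run_cap, int nparticipants,
           Tape *tapes)
{
    size_t      ntapes = 0;

    assert(run_cap > 0 && nparticipants > 0);

    for (size_t off = 0; off < n; off += run_cap)
    {
        size_t      len = (n - off < run_cap) ? n - off : run_cap;
        Tape       *t = &tapes[ntapes];

        t->vals = malloc(len * sizeof(int));
        assert(t->vals != NULL);
        memcpy(t->vals, in + off, len * sizeof(int));
        qsort(t->vals, len, sizeof(int), cmp_int);
        t->len = len;
        t->pos = 0;
        t->owner = (int) (ntapes % (size_t) nparticipants);
        ntapes++;
    }
    return ntapes;
}

/* Merge all tapes into out by repeatedly taking the smallest head value. */
void
merge_tapes(Tape *tapes, size_t ntapes, int *out)
{
    size_t      nout = 0;

    for (;;)
    {
        Tape       *best = NULL;

        for (size_t i = 0; i < ntapes; i++)
        {
            Tape       *t = &tapes[i];

            if (t->pos < t->len &&
                (best == NULL || t->vals[t->pos] < best->vals[best->pos]))
                best = t;
        }
        if (best == NULL)
            break;              /* every tape is exhausted */
        out[nout++] = best->vals[best->pos++];
    }
}

/* Release the memory held by each tape. */
void
free_tapes(Tape *tapes, size_t ntapes)
{
    for (size_t i = 0; i < ntapes; i++)
        free(tapes[i].vals);
}
```

With nparticipants = 2 (a leader and 1 worker) the runs simply alternate
between the two owners, and the merged output is the same as with any other
participant count.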

> I think what might be better is a general solution to my problem,
> which I imagine will crop up again and again as new clients are added.
> I would like an API that lets callers of LaunchParallelWorkers() only
> actually launch *any* worker on the basis of having been able to
> launch some minimum sensible number (typically 2). Otherwise, indicate
> failure, allowing callers to call the whole thing off in a general
> way, without the postmaster having actually launched anything, and
> without custom "call it all off" code for parallel index builds. This
> would probably involve introducing a distinction between a
> BackgroundWorkerSlot being "reserved" rather than "in_use", lest the
> postmaster accidentally launch 1 worker process before we established
> definitively that launching any is really a good idea.

I think that's probably over-engineered. I mean, it wouldn't be that
hard to have the workers just exit if you decide you don't want them,
and I don't really want to make the signaling here more complicated
than it really needs to be.
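
Something along these lines, as a toy sketch only. The names ToyShared,
leader_decide and worker_check are hypothetical and not part of any existing
API, and a real version would need proper synchronization on the shared
state. The leader records its decision after seeing how many workers
launched, and each worker checks that decision before doing any real work:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum
{
    DECISION_PENDING,           /* leader has not decided yet */
    DECISION_GO,                /* enough workers: proceed in parallel */
    DECISION_CANCEL             /* too few workers: workers should exit */
} Decision;

/* Hypothetical state that would live in the shared memory segment. */
typedef struct ToyShared
{
    Decision    decision;
} ToyShared;

typedef enum
{
    WORKER_WAIT,
    WORKER_RUN,
    WORKER_EXIT
} WorkerAction;

/*
 * Leader side: publish the decision.  Returns true if the parallel path
 * should be used; on false the caller would fall back to the serial path.
 */
bool
leader_decide(ToyShared *shared, int nworkers_launched, int min_workers)
{
    assert(shared != NULL);

    shared->decision = (nworkers_launched >= min_workers)
        ? DECISION_GO
        : DECISION_CANCEL;
    return shared->decision == DECISION_GO;
}

/* Worker side: what to do given the leader's current decision. */
WorkerAction
worker_check(const ToyShared *shared)
{
    assert(shared != NULL);

    switch (shared->decision)
    {
        case DECISION_GO:
            return WORKER_RUN;
        case DECISION_CANCEL:
            return WORKER_EXIT;
        default:
            return WORKER_WAIT;
    }
}
```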

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
