Re: parallel mode and parallel contexts

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel mode and parallel contexts
Date: 2015-01-05 16:27:57
Message-ID: CA+TgmoZZ+vODB=dNKY_RSAHL0j0mUnZWNJX7ecvhy0pmyxNzAQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jan 3, 2015 at 7:31 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> A couple remarks:
> * Shouldn't this provide more infrastructure to deal with the fact that
> we might get less parallel workers than we had planned for?

Maybe, but I'm not really sure what that should look like. My working
theory is that individual parallel applications (executor nodes, or
functions that use parallelism internally, or whatever) should be
coded in such a way that they work correctly even if the number of
workers that actually start is smaller than the number requested, or
even zero.
It may turn out that this is impractical in some cases, but I don't
know what those cases are yet so I don't know what the common threads
will be.

I think parallel seq scan should work by establishing N tuple queues
(where N is the number of workers we have). When asked to return a
tuple, the master should poll all of those queues one after another
until it finds one that contains a tuple. If none of them contain a
tuple then it should claim a block to scan and return a tuple from
that block. That way, if there are enough running workers to keep up
with the master's demand for tuples, the master won't do any of the
actual scan itself. But if the workers can't keep up -- e.g. suppose
90% of the CPU consumed by the query is in the filter qual for the
scan -- then the master can pitch in along with everyone else. As a
non-trivial fringe benefit, if the workers don't all start, or take a
while to initialize, the user backend isn't stalled in the meantime.
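
To make that loop concrete, here's a standalone sketch of the shape I
have in mind. Everything in it (TupleQueue, queue_try_receive,
claim_next_block, scan_block_next_tuple) is a hypothetical stand-in
rather than the actual executor API, and it glosses over end-of-scan
coordination with workers that are still running:

#include <stdbool.h>
#include <stddef.h>

typedef struct TupleQueue TupleQueue;   /* one shared queue per worker */
typedef struct Tuple Tuple;

/* Hypothetical helpers, assumed to exist only for this sketch. */
extern Tuple *queue_try_receive(TupleQueue *q);     /* NULL if queue is empty */
extern bool claim_next_block(int *blockno);         /* false when no blocks remain */
extern Tuple *scan_block_next_tuple(int blockno);   /* NULL if block has no match */

Tuple *
master_next_tuple(TupleQueue **queues, int nqueues)
{
    for (;;)
    {
        int         i;
        int         blockno;
        Tuple      *tup;

        /* First, poll each worker queue once. */
        for (i = 0; i < nqueues; i++)
        {
            tup = queue_try_receive(queues[i]);
            if (tup != NULL)
                return tup;
        }

        /*
         * No worker had a tuple ready, so pitch in: claim a block and
         * scan it ourselves.  (For brevity only the first qualifying
         * tuple of the claimed block is returned here.)  If no blocks
         * remain, the scan is complete.
         */
        if (!claim_next_block(&blockno))
            return NULL;

        tup = scan_block_next_tuple(blockno);
        if (tup != NULL)
            return tup;

        /* Block had no qualifying tuples; go around again. */
    }
}

Note that with nqueues == 0 the polling loop is a no-op and the master
simply scans everything itself, which is exactly the degenerate case I
want all of this code to handle gracefully.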

> * I wonder if parallel contexts shouldn't be tracked via resowners

That is a good question. I confess that I'm somewhat fuzzy about
which things should be tracked via the resowner mechanism vs. which
things should have their own bespoke bookkeeping. However, the
AtEOXact_Parallel() stuff happens well before ResourceOwnerRelease(),
which makes merging them seem not particularly clean.

> * combocid.c should error out in parallel mode, as insurance

Eh, which function? HeapTupleHeaderGetCmin(),
HeapTupleHeaderGetCmax(), and AtEOXact_ComboCid() are intended to work
in parallel mode. HeapTupleHeaderAdjustCmax() could
Assert(!IsInParallelMode()) but the only calls are in heap_update()
and heap_delete() which already have error checks, so putting yet
another elog() there seems like overkill.

> * I doubt it's a good idea to allow heap_insert, heap_inplace_update,
> index_insert. I'm not convinced that those will be handled correct and
> relaxing restrictions is easier than adding them.

I'm fine with adding checks to heap_insert() and
heap_inplace_update(). I'm not sure we really need to add anything to
index_insert(); how are we going to get there without hitting some
other prohibited operation first?
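
For the record, the guard I have in mind is the same sort of thing
heap_update() and heap_delete() already have; the placement and the
message below are illustrative only, not final wording:

/*
 * Illustrative check near the top of heap_insert(); the wording would
 * be adjusted for heap_inplace_update().
 */
if (IsInParallelMode())
    ereport(ERROR,
            (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
             errmsg("cannot insert tuples during a parallel operation")));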

> * I'd much rather separate HandleParallelMessageInterrupt() in one
> variant that just tells the machinery there's a interrupt (called from
> signal handlers) and one that actually handles them. We shouldn't even
> consider adding any new code that does allocations, errors and such in
> signal handlers. That's just a *bad* idea.

That's a nice idea, but I didn't invent the way this crap works today.
ImmediateInterruptOK gets set to true while performing a lock wait,
and we need to be able to respond to errors while in that state. I
think there's got to be a better way to handle that than forcing every
asynchronous operation to cope with the fact that ImmediateInterruptOK
may or may not be set, but as of today that's what you have to do.
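
For what it's worth, the split you're describing is the usual
flag-from-the-handler pattern. Here's a toy standalone illustration of
the shape (not the actual patch; the function names are made up): the
handler only records that something arrived, and all the allocation-
and error-prone work happens later from the main loop:

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t parallel_message_pending = 0;

/* Async-signal-safe half: just note that a message arrived. */
static void
handle_parallel_message_signal(int signo)
{
    (void) signo;
    parallel_message_pending = 1;
}

/* Non-signal half: a safe place to allocate, read queues, throw errors. */
static void
process_parallel_messages(void)
{
    if (!parallel_message_pending)
        return;
    parallel_message_pending = 0;

    /* ... read and dispatch the queued messages here ... */
    printf("processing parallel messages\n");
}

int
main(void)
{
    signal(SIGUSR1, handle_parallel_message_signal);

    for (;;)
    {
        process_parallel_messages();    /* analogous to a CHECK_FOR_INTERRUPTS() point */
        pause();                        /* toy wait; a real latch-based loop would
                                         * avoid the wakeup race here */
    }
}

Getting there for real means dealing with the ImmediateInterruptOK mess
first, per the above.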

> * I'm not a fan of the shm_toc stuff... Verbose and hard to read. I'd
> much rather place a struct there and be careful about not using
> pointers. That also would obliviate the need to reserve some ids.

I don't see how that's going to work with variable-size data
structures, and a bunch of the things that we need to serialize are
variable-size. Moreover, I'm not really interested in rehashing this
design again. I know it's not your favorite; you've said that before.
But it makes it possible to write code to do useful things in
parallel, a capability that we do not otherwise have. And I like it
fine.
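
Just to make the variable-size point concrete, here is roughly how a
blob whose length is only known at run time gets stashed and then found
again by key. This is a backend-code fragment only; the key value and
the helper functions wrapped around the shm_toc calls are illustrative:

#include "postgres.h"

#include "storage/shm_toc.h"

#define MY_KEY_BLOB     1               /* illustrative TOC key */

/* Master side: stash a blob whose length is only known at run time. */
static void
stash_blob(shm_toc *toc, const char *data, Size len)
{
    char       *space = shm_toc_allocate(toc, len);

    memcpy(space, data, len);
    shm_toc_insert(toc, MY_KEY_BLOB, space);
}

/* Worker side: find it by key instead of by a fixed struct offset. */
static char *
find_blob(shm_toc *toc)
{
    return shm_toc_lookup(toc, MY_KEY_BLOB);
}

A struct placed at a fixed offset would need worst-case space reserved
for each such item up front, which is the part I don't see working.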

> * Doesn't the restriction in GetSerializableTransactionSnapshotInt()
> apply for repeatable read just the same?

Yeah. I'm not sure whether we really need that check at all, because
GetTransactionSnapshot() probably already checks the same thing.

> * I'm not a fan of relying on GetComboCommandId() to restore things in
> the same order as before.

I thought that was a little wonky, but it's not likely to break, and
there is an elog(ERROR) there to catch it if it does, so I'd rather
not make it more complicated.

> * I'd say go ahead and commit the BgworkerByOid thing
> earlier/independently. I've seen need for that a couple times.

Woohoo! I was hoping someone would say that.

> * s/entrypints/entrypoints/

Thanks.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
