Re: parallel mode and parallel contexts

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel mode and parallel contexts
Date: 2015-01-07 15:39:09
Message-ID: CA+Tgmob_WLEJ7RCtzD5nrfQB0zrJ=2SdiXzhN9yqj3Ybwi+Vcg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 6, 2015 at 9:37 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
> CreateParallelContext(): Does it actually make sense to have nworkers=0?
> ISTM that's a bogus case.

I'm not sure whether we'll ever use the zero-worker case in
production, but I've certainly found it useful for
performance-testing.

> Also, since the number of workers will normally be
> determined dynamically by the planner, should that check be a regular
> conditional instead of just an Assert?

I don't think that's really necessary. It shouldn't be too much of a
stretch for the planner to come up with a non-negative value.

> In LaunchParallelWorkers() the "Start Workers" comment states that we give
> up registering more workers if one fails to register, but there's nothing in
> the if condition to do that, and I don't see
> RegisterDynamicBackgroundWorker() doing it either. Is the comment just
> incorrect?

Woops, that got changed at some point and I forgot to update the
comment. Will fix.

> SerializeTransactionState(): instead of looping through the transaction
> stack to calculate nxids, couldn't we just set it to maxsize -
> sizeof(TransactionId) * 3? If we're looping a second time for safety a
> comment explaining that would be useful...

Yeah, I guess we could do that. I don't think it really matters very
much one way or the other.

> sequence.c: Is it safe to read a sequence value in a parallel worker if the
> cache_value is > 1?

No, because the sequence cache isn't synchronized between the workers.
Maybe it would be safe if cache_value == 1, but there's not much
use-case: how often are you going to have a read-only query that uses
a sequence value. At some point we can look at making sequences
parallel-safe, but worrying about it right now doesn't seem like a
good use of time.

> This may be a dumb question, but for functions do we know that all pl's
> besides C and SQL use SPI? If not I think they could end up writing in a
> worker.

Well, the lower-level checks would catch that. But it is generally
true that there's no way to prevent arbitrary C code from doing things
that are unsafe in parallel mode and that we can't tell are unsafe.
As I've said before, I think that we'll need to have a method of
labeling functions as parallel-safe or not, but this patch isn't
trying to solve that part of the problem.

> @@ -2968,7 +2969,8 @@ ProcessInterrupts(void)
> errmsg("canceling statement due to
> user request")));
> }
> }
> - /* If we get here, do nothing (probably, QueryCancelPending was
> reset) */
> + if (ParallelMessagePending)
> + HandleParallelMessageInterrupt(false);
> ISTM it'd be good to leave that comment in place (after the if).
>
> RestoreComboCIDState(): Assert(!comboCids || !comboHash): Shouldn't that be
> &&? AIUI both should always be either set or 0.

Fixed.

> Typo: + elog(ERROR, "cannot update SecondarySnapshpt during a
> parallel operation");

Fixed.

> The comment for RestoreSnapshot refers to set_transaction_snapshot, but that
> doesn't actually exist (or isn't referenced).

Fixed.

Will post a new version in a bit.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2015-01-07 15:41:43 Re: [RFC] LSN Map
Previous Message Aaron Botsis 2015-01-07 15:37:58 Re: Patch: [BUGS] BUG #12320: json parsing with embedded double quotes