Re: SERIALIZABLE with parallel query

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SERIALIZABLE with parallel query
Date: 2017-02-16 12:49:59
Message-ID: CAEepm=1d5K-hRXjERLia3WhmwExH61q0gfS7_WWdsi=b_0KdoQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Feb 16, 2017 at 2:58 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Nov 8, 2016 at 4:51 PM, Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> Currently we don't generate parallel plans in SERIALIZABLE. What
>> problems need to be solved to be able to do that? I'm probably
>> steamrolling over a ton of subtleties and assumptions here, but it
>> occurred to me that a first step might be something like this:
>>
>> 1. Hand the leader's SERIALIZABLEXACT to workers.
>> 2. Make sure that workers don't release predicate locks.
>> 3. Make sure that the leader doesn't release predicate locks until
>> after the workers have finished accessing the leader's
>> SERIALIZABLEXACT. I think this is already the case.
>
> What happens if the workers exit at the end of the query and the
> leader then goes on and executes more queries? Will the
> worker-acquired predicate locks be retained or will they go away when
> the leader exits?

All predicate locks last at least until ReleasePredicateLocks(), which
runs after ProcReleaseLocks(), and sometimes longer. Although
ReleasePredicateLocks() runs in workers too, this patch makes it
return without doing anything. I suppose someone could say that
ReleasePredicateLocks() should at least run
hash_destroy(LocalPredicateLockHash) and set LocalPredicateLockHash to
NULL in workers. This sort of thing could be important if we start
reusing worker processes. I'll do that in the next version.
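
Roughly what I have in mind for workers, sketched below -- the
IsParallelWorker() test and the exact set of local state to reset are
assumptions on my part, not settled:

    void
    ReleasePredicateLocks(bool isCommit)
    {
        if (IsParallelWorker())
        {
            /*
             * The SERIALIZABLEXACT and the shared predicate lock state
             * belong to the leader; leave them alone.  Just drop the
             * backend-local cache so this worker could be reused.
             */
            if (LocalPredicateLockHash != NULL)
            {
                hash_destroy(LocalPredicateLockHash);
                LocalPredicateLockHash = NULL;
            }
            return;
        }

        /* ... existing non-worker path unchanged ... */
    }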

The predicate locks themselves consist of state in shared memory, and
those created by workers are indistinguishable from those created by
the leader process. Having multiple workers and the leader all
creating predicate locks linked to the same SERIALIZABLEXACT is
*almost* OK, because most of the relevant shmem state is already
protected by locks in all paths. The exception is the DOOMED flag
already mentioned, which follows a "notice me as soon as possible"
philosophy: it is read without locking, to keep locking out of the
CheckForSerializableConflict(In|Out) paths, and a definitive check is
made at commit time.
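
(For reference, the unlocked check I mean is the one at the top of
CheckForSerializableConflictOut(), approximately:

    /* Read of the DOOMED flag bit, deliberately without any lock held. */
    if (SxactIsDoomed(MySerializableXact))
        ereport(ERROR,
                (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                 errmsg("could not serialize access due to read/write "
                        "dependencies among transactions"),
                 errhint("The transaction might succeed if retried.")));

A stale read there just means the failure is noticed slightly later.)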

But... there is a problem with the management of the linked list of
predicate locks held by a transaction. The locks themselves are
covered by partition locks, but the links are not, and that previous
patch broke the assumption that they could never be changed by another
process.

Specifically, DeleteChildTargetLocks() assumes it can walk
MySerializableXact->predicateLocks and throw away locks that are
covered by a new lock (ie throw away tuple locks because a covering
page lock has been acquired) without let or hindrance until it needs
to modify the locks themselves. That assumption doesn't hold up with
that last patch and will require a new kind of mutual exclusion. I
wonder if the solution is to introduce an LWLock into each
SERIALIZABLEXACT object, so DeleteChildTargetLocks() can prevent
workers from stepping on each other's toes during lock cleanup. An
alternative would be to start taking SerializablePredicateLockListLock
in exclusive rather than shared mode, but that seems unnecessarily
coarse.
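
To make that concrete, the shape I'm experimenting with is an LWLock
embedded in SERIALIZABLEXACT -- the field name below is invented for
illustration, and it would need LWLockInitialize() with its own
tranche at setup time:

    typedef struct SERIALIZABLEXACT
    {
        /* ... existing fields ... */
        SHM_QUEUE   predicateLocks;         /* list of PREDICATELOCK objects */
        LWLock      predicateLockListLock;  /* NEW: protects the list links */
        /* ... existing fields ... */
    } SERIALIZABLEXACT;

DeleteChildTargetLocks() would then bracket its walk:

    LWLockAcquire(&MySerializableXact->predicateLockListLock, LW_EXCLUSIVE);
    /* walk MySerializableXact->predicateLocks with SHMQueueNext(),
       deleting tuple locks covered by the new page lock */
    LWLockRelease(&MySerializableXact->predicateLockListLock);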

I have a patch that implements the above but I'm still figuring out
how to test it, and I'll need to do a bit more poking around for other
similar assumptions before I post a new version.

I tried to find any way that LocalPredicateLockHash could create
problems, but it's effectively a cache with
false-negatives-but-never-false-positives semantics. In cache-miss
scenarios we look in the shmem data structures and are prepared to find
that our SERIALIZABLEXACT already has the predicate lock even though
there was a cache miss in LocalPredicateLockHash. That works because
our SERIALIZABLEXACT's address is part of the tag, and it's stable
across backends.
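
(The tag in question is PREDICATELOCKTAG, from predicate_internals.h:

    typedef struct PREDICATELOCKTAG
    {
        PREDICATELOCKTARGET *myTarget;
        SERIALIZABLEXACT    *myXact;    /* stable shmem address */
    } PREDICATELOCKTAG;

so any backend attached to the same SERIALIZABLEXACT computes the same
hash key, and a local-cache miss simply falls through to the shared
table.)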

Random observation: The global variable MyXactDidWrite would probably
need to move into shared memory when parallel workers eventually learn
to write.
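
Something along these lines, say -- the flag value, the helper name
and the choice of SerializableXactHashLock are all invented for
illustration:

    #define SXACT_FLAG_DID_WRITE    0x1000  /* hypothetical new flag bit */

    static void
    RecordSxactWrite(void)
    {
        LWLockAcquire(SerializableXactHashLock, LW_EXCLUSIVE);
        MySerializableXact->flags |= SXACT_FLAG_DID_WRITE;
        LWLockRelease(SerializableXactHashLock);
    }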

--
Thomas Munro
http://www.enterprisedb.com
