Re: Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.
Date: 2016-02-22 13:40:11
Message-ID: CA+TgmobOWq4GMkXzZM+Gjk8b7g5rMgMB8ArZX0QKmO167cabgA@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Wed, Feb 17, 2016 at 9:48 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I just had a rather disturbing thought to the effect that this entire
> design --- ie, parallel workers taking out locks for themselves --- is
> fundamentally flawed. As far as I can tell from README.parallel,
> parallel workers are supposed to exit (and, presumably, release their
> locks) before the leader's transaction commits. Releasing locks before
> commit is wrong. Do I need to rehearse why?

No, you don't. I've spent a good deal of time thinking about that problem.

In typical cases, workers are going to be acquiring either catalog
locks (which are released before commit) or locks on relations which
the leader has already locked (in which case the leader will still
hold the lock - or possibly a stronger one - even after the worker
releases that lock). Suppose, however, that you write a function
which goes and queries some other table not involved in the query, and
therefore acquires a lock on it. If you mark that function PARALLEL
SAFE and it runs only in the worker and not in the leader, then you
could end up with a parallel query that releases the lock before
commit where a non-parallel version of that query would have held the
lock until transaction commit. Of course, one answer to this problem
is - if the early lock release is apt to be a problem for you - don't
mark such functions PARALLEL SAFE.
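
A toy sketch of that scenario, in Python rather than anything resembling
PostgreSQL's real lock manager. ToyLockTable, run_query, and the lock names
are all hypothetical, and the model assumes only what is described above:
a worker drops its locks when it exits, before the leader commits.

```python
from collections import defaultdict


class ToyLockTable:
    """Toy stand-in for a shared lock table: lock name -> set of holders."""

    def __init__(self):
        self.holders = defaultdict(set)

    def acquire(self, holder, lock):
        self.holders[lock].add(holder)

    def release_all(self, holder):
        # Models a process exiting and dropping every lock it holds.
        for lock in list(self.holders):
            self.holders[lock].discard(holder)
            if not self.holders[lock]:
                del self.holders[lock]

    def is_locked(self, lock):
        return lock in self.holders


def run_query(parallel):
    """Return which locks are still held just before the leader commits."""
    table = ToyLockTable()
    table.acquire("leader", "scanned_table")

    # The function that queries "other_table" runs in the worker when the
    # query is parallel, and in the leader otherwise (an assumption of
    # this sketch).
    runner = "worker" if parallel else "leader"
    if parallel:
        table.acquire("worker", "scanned_table")
    table.acquire(runner, "other_table")

    if parallel:
        table.release_all("worker")  # worker exits before commit

    return {name: table.is_locked(name)
            for name in ("scanned_table", "other_table")}
```

In this model, run_query(False) leaves both locks held until commit, while
run_query(True) still holds scanned_table (the leader has it) but has
already lost the lock on other_table.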

I've thought about engineering a better solution. Two possible
designs come to mind. First, we could have the worker send to the
leader a list of locks that it holds at the end of its work, and the
leader could acquire all of those before confirming to the worker that
it is OK to terminate. That has some noteworthy disadvantages, like
being prone to deadlock and requiring workers to stick around
potentially quite a bit longer than they do at present, thus limiting
the ability of other processes to access parallel query. Second, we
could have the workers reassign all of their locks to the leader in
the lock table (unless the leader already holds that lock). The
problem with that is that then the leader is in the weird situation of
having locks in the shared lock table that it doesn't know anything
about - they don't appear in its local lock table. How does the
leader decide which resource owner they go with?
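
The bookkeeping mismatch in the second design can be sketched with a toy
model. This is illustrative Python, not PostgreSQL code:
reassign_worker_locks, the dictionaries, and the lock names are
hypothetical, and the shared and local lock tables are reduced to plain
sets.

```python
def reassign_worker_locks(shared, leader, worker):
    """Move every lock held by `worker` in the shared table to `leader`.

    `shared` maps a lock name to the set of processes holding it.
    """
    for holders in shared.values():
        if worker in holders:
            holders.discard(worker)
            holders.add(leader)  # no-op if the leader already holds it


# Shared lock table before the worker exits.
shared = {
    "scanned_table": {"leader", "worker"},
    "other_table": {"worker"},
}

# What the leader's own (backend-local) bookkeeping knows about.
leader_local = {"scanned_table"}

reassign_worker_locks(shared, "leader", "worker")

held_in_shared = {name for name, holders in shared.items()
                  if "leader" in holders}

# Locks the leader now holds in shared memory but has no local entry for,
# and therefore no resource owner to associate them with.
unknown_to_leader = held_in_shared - leader_local
```

After the reassignment, unknown_to_leader contains other_table: the leader
holds it in the shared table but has no local record of it.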

Unless I'm missing something, though, this is a fairly obscure
problem. Early release of catalog locks is desirable, and locks on
scanned tables should be the same as (or weaker than) locks already held
by the leader. Other cases are rare. I think. It would be good to
know if you think otherwise.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
