Re: LOCK TABLE .. DEFERRABLE

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Rod Taylor <rod(dot)taylor(at)gmail(dot)com>
Subject: Re: LOCK TABLE .. DEFERRABLE
Date: 2016-12-02 04:18:24
Message-ID: CAJrrPGekozEhrGjy94+Tq8erupfELK-Lo6hE8DNJuA4h5jq3zg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 16, 2016 at 3:51 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Tue, Sep 6, 2016 at 6:04 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > On 1 September 2016 at 21:28, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> >> So the only way to handle multiple locks is to do this roughly the way
> >> Rod suggests.
> >
> > I've just been reading the VACUUM code and it turns out that we
> > already use Rod's mechanism internally. So on that basis it seems fine
> > to support this as a useful user-level feature. If there is a better
> > way of doing it, then that can be added later.
> >
> > My proposed changes to this patch are these
> >
> > 1. Rename this WAIT PATIENTLY, which is IMHO a better description of
> > what is being requested. Bikeshedding welcome.
> >
> > 2. Introduce a new API call LockTablePatiently() that returns bool. Its
> > usage is similar to ConditionalLockTable(); the only difference is
> > that you supply some additional wait parameters with it. This isolates
> > the internal mechanism from the usage, and so allows us to more easily
> > support any fancy new way of doing this we think of later (a possible
> > signature is sketched below).
> >
> > 3. Use LockTablePatiently() within lockcmds.c where appropriate
> >
> > 4. Replace the code for waiting in VACUUM with the new call to
> > LockTablePatiently()
> >
> > So I see this as 2 patches: 1) new API and make VACUUM use new API, 2)
> > Rod's LOCK TABLE patch
> >
> > First patch attached; it also adds the ability to lock by Oid. If we
> > agree, Rod, please update your patch to match?
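> >
> > A sketch of what such a signature might look like (the parameter list
> > is illustrative only; the actual wait parameters are still to be
> > decided):
> >
> >     /*
> >      * Hypothetical: like ConditionalLockRelationOid(), but instead of
> >      * failing immediately on conflict, retry every "waitInterval"
> >      * milliseconds, yielding to conflicting requests in between.
> >      * Returns true once the lock has been acquired.
> >      */
> >     extern bool LockTablePatiently(Oid relid, LOCKMODE lockmode,
> >                                    int waitInterval);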
>
> Aside from the fact that polling is generally inefficient and wasteful
> of system resources, this allows for undetected deadlocks. Consider:
>
> S1: LOCK TABLE A;
> S2: LOCK TABLE B;
> S1: LOCK TABLE B; -- blocks
> S2: LOCK TABLE A PATIENTLY; -- retries forever
>
> Livelock might be possible, too.
>
> I think it would be better to think harder about what would be
> required to implement this natively in the lock manager. Suppose we
> add a flag to each PROCLOCK which, if true, indicates that the lock
> request is low priority. Also, we add a counter to each LOCK
> indicating the number of pending low-priority lock requests. When
> LockAcquireExtended reaches this code here...
>
>     if (lockMethodTable->conflictTab[lockmode] & lock->waitMask)
>         status = STATUS_FOUND;
>     else
>         status = LockCheckConflicts(lockMethodTable, lockmode,
>                                     lock, proclock);
>
> ...we add an additional check to the upper branch: if the number of
> low-priority waiters is not 0, then we walk the wait queue; if all
> waiters that conflict with the requested lock mode are low-priority,
> then we set status = STATUS_OK. So, new lock requests refuse to queue
> behind low-priority lock waiters.
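>
> A sketch of that branch, with the proposed fields given illustrative
> names (an nLowPrioWaiters counter on LOCK and a PROCLOCK_LOW_PRIORITY
> flag bit on PROCLOCK; neither exists today). One refinement over the
> prose above: even when we decline to queue behind low-priority
> waiters, we still run LockCheckConflicts() against the current lock
> holders rather than returning STATUS_OK outright, since a holder may
> conflict as well:
>
>     if (lockMethodTable->conflictTab[lockmode] & lock->waitMask)
>     {
>         if (lock->nLowPrioWaiters > 0 &&
>             OnlyLowPriorityConflicts(lockMethodTable, lockmode, lock))
>         {
>             /* All conflicting waiters are low-priority: don't queue
>              * behind them, just check against the current holders. */
>             status = LockCheckConflicts(lockMethodTable, lockmode,
>                                         lock, proclock);
>         }
>         else
>             status = STATUS_FOUND;
>     }
>     else
>         status = LockCheckConflicts(lockMethodTable, lockmode,
>                                     lock, proclock);
>
> where the hypothetical helper walks lock->waitProcs:
>
>     static bool
>     OnlyLowPriorityConflicts(LockMethod lockMethodTable,
>                              LOCKMODE lockmode, LOCK *lock)
>     {
>         PROC_QUEUE *waitQueue = &lock->waitProcs;
>         PGPROC     *proc = (PGPROC *) waitQueue->links.next;
>         int         i;
>
>         for (i = 0; i < waitQueue->size; i++)
>         {
>             if ((lockMethodTable->conflictTab[lockmode] &
>                  LOCKBIT_ON(proc->waitLockMode)) &&
>                 !(proc->waitProcLock->flags & PROCLOCK_LOW_PRIORITY))
>                 return false;          /* normal-priority conflict */
>             proc = (PGPROC *) proc->links.next;
>         }
>         return true;
>     }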
>
> Is that good enough to implement the requested behavior, or do we need
> to do more? If we only do what's described above, then a
> normal-priority waiter which joins the queue after a low-priority
> waiter is already waiting will let the low-priority waiter go first.
> That's probably not desirable, but it's pretty easy to fix: the logic
> that determines where a new waiter enters the wait queue is in
> ProcSleep(), and it wouldn't be too hard to arrange for new
> normal-priority waiters to skip over any low-priority waiters that are
> at the end of the existing wait queue (but not any that are in the
> middle, because if we did that we'd also be skipping over
> normal-priority waiters, which we shouldn't).
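>
> A sketch of that insertion rule (ProcSleep() currently appends at the
> tail with SHMQueueInsertBefore(&waitQueue->links, &MyProc->links);
> PROCLOCK_LOW_PRIORITY is the same hypothetical flag bit as above):
>
>     /*
>      * Normal-priority request: walk backward from the queue tail past
>      * any trailing low-priority waiters, so that we queue ahead of
>      * them but never ahead of another normal-priority waiter.
>      */
>     PGPROC *insertBefore = NULL;
>     PGPROC *prev = (PGPROC *) waitQueue->links.prev;    /* tail */
>     int     n = waitQueue->size;
>
>     while (n-- > 0 &&
>            (prev->waitProcLock->flags & PROCLOCK_LOW_PRIORITY))
>     {
>         insertBefore = prev;
>         prev = (PGPROC *) prev->links.prev;
>     }
>
>     if (insertBefore != NULL)
>         SHMQueueInsertBefore(&insertBefore->links, &MyProc->links);
>     else
>         SHMQueueInsertBefore(&waitQueue->links, &MyProc->links);
>     waitQueue->size++;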
>
> What more? Well, even after doing those two things, it's still
> possible for either the simple deadlock logic in ProcSleep() or the
> full deadlock detector to put a low-priority waiter in front of a
> normal-priority waiter. However, our typical policy is to advance
> waiters in the wait queue as little as possible. In other words, if
> the wait queue contains A B C and we will deadlock unless C is moved
> up, we will move it ahead of B but not A if that is sufficient to
> avoid the deadlock. We will only move it ahead of both B and A if
> that is necessary to avoid deadlock. So, low-priority requests won't
> be moved up further than needed, which is good.
>
> Still, it is possible to construct scenarios where we won't get
> perfect low-priority behavior without more invasive changes. For
> example, suppose we make a low-priority request queue-jump over an
> intervening waiter to avoid deadlocking against it. Next, a
> normal-priority waiter enters the queue. Then, the guy we skipped
> aborts. At this point, we could in theory notice that it's possible
> to move the low-priority request behind the new normal-priority
> waiter. However, I think we shouldn't do that. We certainly can't do
> it unconditionally because it might introduce deadlocks. We could
> test whether it will introduce a deadlock and do it only if not, but
> that's expensive. Instead, I think we should document that a
> low-priority request will ordinarily be satisfied only after all
> conflicting normal-priority lock requests have been, but that
> this is not guaranteed in the case where the wait queue is rearranged
> to avoid deadlock. I don't think that limitation ought to be a
> tremendous problem for users, and the alternatives are pretty
> unappealing.
>

Closed in 2016-11 commitfest with "returned with feedback" status.
Please feel free to update the status once you submit an updated patch
that addresses the problems discussed above.

Regards,
Hari Babu
Fujitsu Australia
