Re: LWLock deadlock and gdb advice

From: Andres Freund <andres(at)anarazel(dot)de>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LWLock deadlock and gdb advice
Date: 2015-07-29 11:39:49
Message-ID: 20150729113949.GC10043@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

Finally getting to this.

On 2015-07-15 18:44:03 +0300, Heikki Linnakangas wrote:
> Previously, LWLockAcquireWithVar set the variable associated with the lock
> atomically with acquiring it. Before the lwlock-scalability changes, that
> was straightforward because you held the spinlock anyway, but it's a lot
> harder/expensive now. So I changed the way acquiring a lock with a variable
> works. There is now a separate flag, LW_FLAG_VAR_SET, which indicates that
> the current lock holder has updated the variable. The LWLockAcquireWithVar
> function is gone - you now just use LWLockAcquire(), which always clears the
> LW_FLAG_VAR_SET flag, and you can call LWLockUpdateVar() after that if you
> want to set the variable immediately.
>
> This passes make check, but I haven't done any testing beyond that. Does
> this look sane to you?

The main thing I dislike about the patch is how long it now holds the
spinlock in WaitForVar. I don't understand why that's necessary. There's
no need to hold the spinlock until after the
    mustwait = (pg_atomic_read_u32(&lock->state) & LW_VAL_EXCLUSIVE) != 0;
check, unless I'm missing something?
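
To make the point about spinlock scope concrete, here is a minimal
standalone sketch in C11. It is not the PostgreSQL code: ToyLock,
TOY_VAL_EXCLUSIVE and the toy_* functions are hypothetical stand-ins
invented for illustration. It assumes the lock state can be read
atomically without the spinlock, so the spinlock is only taken
afterwards, for the short read of the protected variable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for LW_VAL_EXCLUSIVE; the bit position is arbitrary. */
#define TOY_VAL_EXCLUSIVE (1u << 24)

typedef struct ToyLock
{
    _Atomic uint32_t state;   /* lock state, readable without the spinlock */
    atomic_flag      mutex;   /* toy spinlock protecting 'var' only */
    uint64_t         var;     /* the variable associated with the lock */
} ToyLock;

static void
toy_lock_init(ToyLock *lock)
{
    atomic_init(&lock->state, 0);
    atomic_flag_clear(&lock->mutex);
    lock->var = 0;
}

static void
toy_spin_acquire(ToyLock *lock)
{
    while (atomic_flag_test_and_set_explicit(&lock->mutex, memory_order_acquire))
        ;                       /* busy-wait until the flag is clear */
}

static void
toy_spin_release(ToyLock *lock)
{
    atomic_flag_clear_explicit(&lock->mutex, memory_order_release);
}

/*
 * Returns true if the caller would have to wait: the lock is held
 * exclusively and the variable still equals oldval. If the variable has
 * changed, stores the new value in *newval and returns false.
 */
static bool
toy_must_wait(ToyLock *lock, uint64_t oldval, uint64_t *newval)
{
    uint64_t value;

    assert(newval != NULL);

    /* Step 1: atomic read of the state, no spinlock held. */
    if ((atomic_load(&lock->state) & TOY_VAL_EXCLUSIVE) == 0)
        return false;           /* lock is free, nothing to wait for */

    /* Step 2: the spinlock is held only for the read of the variable. */
    toy_spin_acquire(lock);
    value = lock->var;
    toy_spin_release(lock);

    if (value != oldval)
    {
        *newval = value;        /* holder has already moved the variable */
        return false;
    }
    return true;
}
```

A caller would initialize the lock with toy_lock_init() and call
toy_must_wait() with the last value it saw. The sketch says nothing
about queueing or wakeups; it only shows that the state check and the
variable read can be separated.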

In an earlier email you say:
> After the spinlock is released above, but before the LWLockQueueSelf() call,
> it's possible that another backend comes in, acquires the lock, changes the
> variable's value, and releases the lock again. In 9.4, the spinlock was not
> released until the process was queued.

But that's not a problem. The updater in that case will have queued a
wakeup for all waiters, including those in WaitForVar()?
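
As a rough illustration of what "queued a wakeup for all waiters" means,
here is a toy single-threaded model in C. ToyVarLock, ToyWaiter and the
toy_* functions are hypothetical names, not PostgreSQL APIs. It assumes
the waiters are already on the queue when the updater releases, and it
deliberately does not model the window before a waiter has queued
itself, which is the case under discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_WAITERS 8

typedef struct ToyWaiter
{
    bool woken;                 /* set when a wakeup has been delivered */
} ToyWaiter;

typedef struct ToyVarLock
{
    unsigned long var;                       /* variable tied to the lock */
    ToyWaiter    *queue[TOY_MAX_WAITERS];    /* waiters, in arrival order */
    size_t        nwaiters;
} ToyVarLock;

/* Put a waiter on the lock's wait queue; returns false if the queue is full. */
static bool
toy_queue_self(ToyVarLock *lock, ToyWaiter *waiter)
{
    assert(lock != NULL && waiter != NULL);

    if (lock->nwaiters == TOY_MAX_WAITERS)
        return false;
    waiter->woken = false;
    lock->queue[lock->nwaiters++] = waiter;
    return true;
}

/*
 * Change the variable and release: every waiter that is queued at this
 * point is marked woken and removed from the queue. Returns the number
 * of waiters woken.
 */
static size_t
toy_update_and_release(ToyVarLock *lock, unsigned long newval)
{
    size_t n = lock->nwaiters;

    lock->var = newval;
    for (size_t i = 0; i < n; i++)
        lock->queue[i]->woken = true;
    lock->nwaiters = 0;
    return n;
}
```

In this model a waiter that queued before the release always sees
woken set afterwards; a waiter that queues only after the release gets
no wakeup from it.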

I'll try to reproduce the problem now. But I do wonder if it's possibly
just the missing spinlock during the update.

Andres
