Re: XactLockTableWait doesn't set wait_event correctly

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: XactLockTableWait doesn't set wait_event correctly
Date: 2016-11-29 19:04:40
Message-ID: CA+Tgmoa6K49wfSWbmPmY793-wz9bLm6-i8MvkJMz7hf2ANg9ww@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 29, 2016 at 6:26 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> New (9.4) XactLockTableWait() sets the reason for the wait, so that if
> we emit a log message then it will produce a log message to say its
> not waiting on a transaction, its waiting on a lock.
>
> Nice. log_lock_waits works great.
>
> New (9.6) wait_event code is not correctly set. In ProcSleep() we set
> the wait reason according to the actual lock tag, effectively ignoring
> the information specifically provided by XactLockTableWait().
>
> Ugh.
>
> pg_stat_activity shows a wait_event of 'transactionid' rather than 'tuple'
> pg_locks shows a transactionid lock rather than a wait for tuple lock
>
> That looks like a bug since we get different answers from
> log_lock_wait and wait_event, which is annoying and confusing.

What you find annoying or confusing is up to you, but it's not a bug,
and it's not particularly inconsistent, either. log_lock_waits
reports a transaction lock, pg_stat_activity reports a transactionid
lock, and pg_locks reports a transactionid lock. Now, it is true *in
addition* to reporting that the lock in question is a transaction
lock, log_lock_waits also reports some context information:

2016-11-29 13:49:59.702 EST [33658] LOG: process 33658 still waiting
for ShareLock on transaction 2997486 after 1000.748 ms
2016-11-29 13:49:59.702 EST [33658] DETAIL: Process holding the lock:
33653. Wait queue: 33658.
2016-11-29 13:49:59.702 EST [33658] CONTEXT: while updating tuple
(0,67) in relation "pgbench_accounts"
2016-11-29 13:49:59.702 EST [33658] STATEMENT: update
pgbench_accounts set filler = 'b' where aid = 1;

...but right there on the first line, it clearly says "waiting for
ShareLock on transaction 2997".

I'm not totally unsympathetic to the argument that perhaps
XactLockTableWait() ought to pass down the context information through
LockAcquire() so that it reaches ProcSleep() and we can set the wait
event to something like Tuple/Update instead of Lock/Transaction. But
I don't think it's a slam-dunk, either. If we did that, it would be
consistent with the additional detail provided by log_lock_waits
rather than the primary message, and it would be inconsistent with
pg_locks. Also, I think that the reason why we originally used an
error-context callback here rather than passing down some additional
detail into LockAcquire() is that the breaking the abstraction layer
seemed messy and unappealing. IIRC, Tom, in particular, was concerned
about that last point.

But feel free to propose a patch...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-11-29 19:13:57 Re: GiST support for UUIDs
Previous Message Amit Kapila 2016-11-29 19:03:49 Re: XactLockTableWait doesn't set wait_event correctly