Re: unexpected lock waits (was Re: [GENERAL] Do not understand why this happens)

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bill Moran <wmoran(at)potentialtech(dot)com>, Aln Kapa <alnkapa(at)gmail(dot)com>, Postgres General Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: unexpected lock waits (was Re: [GENERAL] Do not understand why this happens)
Date: 2019-06-09 20:22:44
Message-ID: 20190609202244.GA28125@svana.org
Lists: pgsql-general

(sorry, very old thread)

On Fri, Mar 15, 2013 at 09:38:06AM -0400, Tom Lane wrote:
> Bill Moran <wmoran(at)potentialtech(dot)com> writes:
> > I do wonder what else is happening in the transaction that you're
> > calling NOTIFY within; and that some other statement could be causing
> > the lock wait.
>
> FWIW, the lock seems to be the one taken to serialize insertions into
> the shared NOTIFY queue, from this bit in commands/async.c:

[SNIP]

> This lock is held while inserting the transaction's notify message(s),
> after which the transaction commits and releases the lock. There's not
> very much code in that window. So what we can conclude is that some
> other transaction also doing NOTIFY hung up within that sequence for
> something in excess of 3 seconds. We have been shown no data whatsoever
> that would allow us to speculate about what's causing that other
> transaction to take so long to get through its commit sequence.

I just want to add that, after running into this very same issue (see
[1]), the above conclusion turned out to be incorrect in our case. It is
not the NOTIFYing transactions that hold the lock too long, but the
LISTENing backends. In our case this is because we have lots of
databases, and all databases share a single global NOTIFY queue.
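
As a rough illustration of why a single shared queue hurts with many
databases, here is a toy Python model. It is not PostgreSQL's actual
code and does not model the lock itself; the names Notification and
scan_queue, the database names, and the counts are all made up for the
example. It only shows that a listener interested in one database still
has to step over every entry posted by the others:

```python
from dataclasses import dataclass


@dataclass
class Notification:
    database: str  # database the NOTIFY was issued in
    channel: str
    payload: str


def scan_queue(queue, start, my_database, my_channels):
    """Walk the shared queue from `start`.

    Returns (delivered, examined): the entries relevant to this listener
    and the total number of entries it had to look at.
    """
    delivered = []
    examined = 0
    for entry in queue[start:]:
        examined += 1
        if entry.database == my_database and entry.channel in my_channels:
            delivered.append(entry)
    return delivered, examined


# Hypothetical workload: 50 databases each post 20 notifications
# into the one shared queue.
queue = [
    Notification(f"db{d}", "events", f"msg{i}")
    for i in range(20)
    for d in range(50)
]

delivered, examined = scan_queue(queue, 0, "db7", {"events"})
print(len(delivered), examined)  # 20 relevant entries out of 1000 examined
```

In this model the work per listener grows with the total notification
traffic across all databases, not with the traffic in its own database,
which is consistent with the lock being held longer as the number of
databases goes up.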

To verify this I made some small patches that significantly reduce the
time LISTENing backends hold the lock, and they greatly reduce the
problem for us, see [2]. A slow commit does have a bit of impact, but
the bulk of the time is elsewhere.

[1]: https://www.postgresql.org/message-id/CADWG95t0j9zF0uwdcMH81KMnDsiTAVHxmBvgYqrRJcD-iLwQhw@mail.gmail.com

[2]: https://www.postgresql.org/message-id/CADWG95uLhar1uq6PQLoY1mTQYeN23c1dvOr2tVjcXUBZ1ge9XA@mail.gmail.com

Hope this helps.
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> The combine: one man, one day, wheat for half a million loaves of bread.
