Re: Is the unfair lwlock behavior intended?

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Is the unfair lwlock behavior intended?
Date: 2016-05-24 10:29:37
Message-ID: CAPpHfdtOCPvvL_irrES+M5YZ+jZR8bUSQ7cz39ObjEuOaDsgsw@mail.gmail.com
Lists: pgsql-hackers

Hi!

On Tue, May 24, 2016 at 9:03 AM, Tsunakawa, Takayuki <
tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> wrote:

> I encountered a strange behavior of lightweight lock in PostgreSQL 9.2.
> That appears to apply to 9.6, too, as far as I can tell from the code. Could you
> tell me if the behavior is intended or needs a fix?
>
> Simply put, the unfair behavior is that waiters for exclusive mode are
> overtaken by share-mode lockers who arrive later.
>
>
> PROBLEM
> ====================
>
> Under a heavy read/write workload on a big machine with dozens of CPUs and
> hundreds of GBs of RAM, psql sometimes took more than 30 seconds to connect
> to the database (and actually, it failed to connect due to our
> connect_timeout setting.) The backend corresponding to the psql was
> waiting to acquire exclusive mode lock on ProcArrayLock. Some other
> backends took more than 10 seconds to commit their transactions, waiting
> for exclusive mode lock on ProcArrayLock.
>
> At that time, many backend processes (I forgot the number) were acquiring
> and releasing share mode lock on ProcArrayLock, most of which were from
> TransactionIsInProgress().
>
>
> CAUSE
> ====================
>
> Going into the 9.2 code, I realized that those who request share mode
> don't pay attention to the wait queue. That is, if some processes hold
> share mode lock and someone is waiting for exclusive mode in the wait
> queue, other processes who come later can get share mode overtaking those
> who are already waiting. If many processes repeatedly request share mode,
> the waiters can't get exclusive mode for a long time.
>
> Is this intentional, or should we make the later share-lockers wait if
> someone is in the wait queue?
>

I've already observed such behavior, see [1]. I think there is currently no
consensus on how to fix it. For instance, Andres expressed the opinion that
this shouldn't be fixed on the LWLock side [2].
FYI, I'm planning to pick up work on the CSN patch [3] for 10.0. CSN should fix
various scalability issues, including high ProcArrayLock contention.
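
To make the overtaking described in the quoted CAUSE section concrete, here
is a minimal toy model in C. It is only a sketch and is not PostgreSQL's
lwlock.c: the ToyLock struct, the try_* functions and the simulation loop are
all hypothetical names made up for illustration. It assumes a steady stream
of share-lockers in which each new one arrives just before the oldest holder
releases.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: NOT PostgreSQL's LWLock. All names are hypothetical. */
typedef struct ToyLock
{
    int  shared_holders;     /* current share-mode holders */
    bool exclusive_held;     /* true while an exclusive holder exists */
    int  exclusive_waiters;  /* requests queued for exclusive mode */
} ToyLock;

/* Unfair policy: a share request ignores the wait queue. */
static bool try_shared_unfair(ToyLock *l)
{
    if (l->exclusive_held)
        return false;
    l->shared_holders++;
    return true;
}

/* Fair policy: a share request yields if anyone waits for exclusive mode. */
static bool try_shared_fair(ToyLock *l)
{
    if (l->exclusive_held || l->exclusive_waiters > 0)
        return false;
    l->shared_holders++;
    return true;
}

static void release_shared(ToyLock *l)
{
    assert(l->shared_holders > 0);
    l->shared_holders--;
}

/* Exclusive mode needs the lock to be completely free. */
static bool try_exclusive(ToyLock *l)
{
    if (l->exclusive_held || l->shared_holders > 0)
        return false;
    l->exclusive_held = true;
    return true;
}

/*
 * Start with one share holder and one queued exclusive waiter. In each step
 * a new share-locker arrives, then the oldest share holder releases, then
 * the waiter retries. Returns the step at which exclusive mode is granted,
 * or -1 if it is never granted within max_steps.
 */
static int steps_until_exclusive(bool fair, int max_steps)
{
    ToyLock l = {1, false, 1};

    for (int step = 1; step <= max_steps; step++)
    {
        if (fair)
            try_shared_fair(&l);
        else
            try_shared_unfair(&l);

        release_shared(&l);

        if (try_exclusive(&l))
        {
            l.exclusive_waiters--;
            return step;
        }
    }
    return -1;
}
```

In this model, steps_until_exclusive(true, 1000) grants the lock at step 1,
while steps_until_exclusive(false, 1000) returns -1: the share holder count
never drops to zero, so the waiter starves for as long as the stream lasts.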

References.

[1] http://www.postgresql.org/message-id/CAPpHfdsytkTFMy3N-zfSo+kAuUx=u-7JG6q2bYB6Fpuw2cD5DQ@mail.gmail.com
[2] http://www.postgresql.org/message-id/20151211130413.GO14789@awork2.anarazel.de
[3] http://www.postgresql.org/message-id/CA+CSw_tEpJ=md1zgxPkjH6CWDnTDft4gBi=+P9SnoC+Wy3pKdA@mail.gmail.com

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
