Re: Configurable FP_LOCK_SLOTS_PER_BACKEND

From: Matt Smiley <msmiley(at)gitlab(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Nikolay Samokhvalov <nik(at)postgres(dot)ai>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Configurable FP_LOCK_SLOTS_PER_BACKEND
Date: 2023-08-07 20:59:26
Message-ID: CA+eRB3qYKAn3SVB_1wwYNQx36hLFkm6-th=gCPxxczQXxE_B6A@mail.gmail.com
Lists: pgsql-hackers

Hi Andres, thanks for helping! Great questions, replies are inline below.

On Sun, Aug 6, 2023 at 1:00 PM Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hm, I'm curious whether you have a way to trigger the issue outside of your
> prod environment. Mainly because I'm wondering if you're potentially hitting
> the issue fixed in a4adc31f690 - we ended up not backpatching that fix, so
> you'd not see the benefit unless you reproduced the load in 16+.
>

Thanks for sharing this!

I have not yet written a reproducer, since we see this daily in production.
I have a sketch of a few approaches that I think will reproduce the behavior
we're observing, but I haven't had time to implement them yet.

I'm not sure if we're seeing this behavior in production, but it's
definitely an interesting find. Currently we are running postgres 12.11,
with an upcoming upgrade to 15 planned. Good to know there's a potential
improvement waiting in 16. I noticed that in LWLockAcquire the call to
LWLockDequeueSelf occurs (
https://github.com/postgres/postgres/blob/REL_12_11/src/backend/storage/lmgr/lwlock.c#L1218)
directly between the unsuccessful attempt to immediately acquire the lock
and reporting the backend's wait event. The distinctive indicators we have
been using for this pathology are the "lock_manager" wait_event and its
associated USDT probe (
https://github.com/postgres/postgres/blob/REL_12_11/src/backend/storage/lmgr/lwlock.c#L1236-L1237),
both of which occur after whatever overhead is incurred by
LWLockDequeueSelf. As you mentioned in your commit message, that overhead
is hard to detect. My first impression is that whatever overhead it incurs
is in addition to what we are investigating.
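
For what it's worth, here is a rough sketch of how that extra overhead could
be measured directly on our 12.11 binary, using function entry/return probes
rather than the USDT probes. This assumes the LWLockDequeueSelf symbol
survives in the binary -- it is a static function, so the compiler may have
inlined or renamed it:

sudo ./bpftrace -e '
  // Record entry time per thread when a backend enters LWLockDequeueSelf.
  // Assumption: the symbol is present (not inlined) at this binary path.
  uprobe:/usr/lib/postgresql/12/bin/postgres:LWLockDequeueSelf
  { @entry[tid] = nsecs; }
  // On return, accumulate the elapsed time into a nanosecond histogram.
  uretprobe:/usr/lib/postgresql/12/bin/postgres:LWLockDequeueSelf
  /@entry[tid]/
  { @dequeue_ns = hist(nsecs - @entry[tid]); delete(@entry[tid]); }
  interval:s:10 { exit(); }'

That would give a distribution of the time spent inside LWLockDequeueSelf
itself, separate from the wait time bracketed by the lwlock__wait__start and
lwlock__wait__done probes.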

> I'm also wondering if it's possible that the reason for the throughput drops
> are possibly correlated with heavyweight contention or higher frequency
> access to the pg_locks view. Deadlock checking and the locks view acquire
> locks on all lock manager partitions... So if there's a bout of real lock
> contention (for longer than deadlock_timeout)...
>
>

Great questions, but we ruled that out. The deadlock_timeout is 5 seconds,
so frequently hitting that would massively violate SLO and would alert the
on-call engineers. The pg_locks view is scraped a couple of times per minute
for metrics collection, but the lock_manager lwlock contention can be
observed thousands of times every second, typically with very short
durations. The following example (captured just now) shows the number of
times per second over a 10-second window that any 1 of the 16
"lock_manager" lwlocks was contended:

msmiley(at)patroni-main-2004-103-db-gprd(dot)c(dot)gitlab-production(dot)internal:~$ sudo ./bpftrace -e '
  usdt:/usr/lib/postgresql/12/bin/postgres:lwlock__wait__start
  /str(arg0) == "lock_manager"/ { @[arg1] = count(); }
  interval:s:1 { print(@); clear(@); }
  interval:s:10 { exit(); }'
Attaching 5 probes...
@[0]: 12122
@[0]: 12888
@[0]: 13011
@[0]: 13348
@[0]: 11461
@[0]: 10637
@[0]: 10892
@[0]: 12334
@[0]: 11565
@[0]: 11596

Typically that contention only lasts a couple of microseconds, but the long
tail can sometimes be much slower. Details here:
https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/2301#note_1365159507
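
For context, here is a sketch of one way to capture that per-wait duration
distribution with the same USDT probes (not necessarily the exact script from
that issue; same binary path as above, histogram buckets in microseconds):

sudo ./bpftrace -e '
  // Stamp the start of each lock_manager lwlock wait, keyed by thread.
  usdt:/usr/lib/postgresql/12/bin/postgres:lwlock__wait__start
  /str(arg0) == "lock_manager"/ { @start[tid] = nsecs; }
  // On wait completion, record the elapsed time in a microsecond histogram.
  usdt:/usr/lib/postgresql/12/bin/postgres:lwlock__wait__done
  /str(arg0) == "lock_manager" && @start[tid]/
  { @wait_us = hist((nsecs - @start[tid]) / 1000); delete(@start[tid]); }
  interval:s:10 { exit(); }'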

> Given that most of your lock manager traffic comes from query planning -
> have you evaluated using prepared statements more heavily?
>

Yes, there are unrelated obstacles to doing so -- that's a separate can of
worms, unfortunately. But even if we used prepared statements, in this
pathology the backend would still need to reacquire the same locks during
each transaction it executes. So the lock acquisition rate stays the same:
whether the planner or the executor does the acquiring, the same relations
have to be locked.
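
If it is useful, that equivalence is also easy to check empirically: counting
calls to LockAcquire per backend before and after switching a workload to
prepared statements should show roughly the same rate. A sketch (LockAcquire
is an exported lmgr function, but the binary path and the choice of probe
point are assumptions to adjust for your build):

sudo ./bpftrace -e '
  // Count heavyweight lock acquisitions per backend pid.
  // Assumption: probing LockAcquire at this binary path is representative.
  uprobe:/usr/lib/postgresql/12/bin/postgres:LockAcquire
  { @acquires[pid] = count(); }
  // Print and reset the per-second counts, then stop after 10 seconds.
  interval:s:1 { print(@acquires); clear(@acquires); }
  interval:s:10 { exit(); }'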
