Re: Configurable FP_LOCK_SLOTS_PER_BACKEND

From: Matt Smiley <msmiley(at)gitlab(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Nikolay Samokhvalov <nik(at)postgres(dot)ai>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Configurable FP_LOCK_SLOTS_PER_BACKEND
Date: 2023-08-02 23:51:29
Message-ID: CA+eRB3qn8crRpykquMd4VO-LdKcqEPQRW6k_XWih7N0CeODfvw@mail.gmail.com
Lists: pgsql-hackers

I thought it might be helpful to share some more details from one of the
case studies behind Nik's suggestion.

Bursty contention on lock_manager lwlocks recently became a recurring cause
of query throughput drops for GitLab.com, and we got to study the behavior
via USDT and uprobe instrumentation along with more conventional
observations (see
https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/2301). This
turned up some interesting findings, and I'd like to share the relevant
parts of that research here.

Results so far suggest that increasing FP_LOCK_SLOTS_PER_BACKEND would have
a much larger positive impact than any other mitigation strategy we have
evaluated. Rather than reducing hold duration or collision rate, adding
fastpath slots reduces the frequency of even having to acquire those
lock_manager lwlocks. I suspect this would be helpful for many other
workloads, particularly those with high-frequency queries whose tables
collectively have more than about 16 indexes.
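
For context on why the slot count matters so much, the decision in
LockAcquireExtended() has roughly this shape (paraphrased from
src/backend/storage/lmgr/lock.c as of PG 16, with the strong-lock check
and error handling elided, so please treat it as a sketch rather than the
real code):

    /* Simplified shape of the fastpath-vs-slowpath decision (PG 16). */
    if (EligibleForRelationFastPath(locktag, lockmode) &&
        FastPathLocalUseCount < FP_LOCK_SLOTS_PER_BACKEND)
    {
        /* Fastpath: record the lock in this backend's own PGPROC slots,
         * guarded only by the per-backend fpInfoLock. */
        LWLockAcquire(&MyProc->fpInfoLock, LW_EXCLUSIVE);
        acquired = FastPathGrantRelationLock(locktag->locktag_field2,
                                             lockmode);
        LWLockRelease(&MyProc->fpInfoLock);
        if (acquired)
            return LOCKACQUIRE_OK;
    }

    /* Slowpath: the lock tag's hash picks one of the shared lock_manager
     * partitions, whose lwlock must be taken in exclusive mode. */
    partitionLock = LockHashPartitionLock(hashcode);
    LWLockAcquire(partitionLock, LW_EXCLUSIVE);

Every acquisition that fails the FastPathLocalUseCount check lands in the
second branch, and that is exactly the lwlock traffic we would like to
avoid.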

Lowering the lock_manager lwlock acquisition rate means lowering its
contention rate (and probably also its contention duration, since exclusive
mode forces concurrent lockers to queue).

I'm confident this would help our workload, and I strongly suspect it would
be generally helpful by letting queries use fastpath locking more often.

> However, the lmgr/README says this is meant to alleviate contention on
> the lmgr partition locks. Wouldn't it be better to increase the number
> of those locks, without touching the PGPROC stuff?

That was my first thought too, but growing the lock_manager lwlock tranche
isn't nearly as helpful.

On the slowpath, each relation's lock tag deterministically hashes onto a
specific lock_manager lwlock, so growing the number of lock_manager lwlocks
just makes it less likely for two or more frequently locked relations to
hash onto the same lock_manager lwlock.
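
(The mapping I mean is this one, from src/include/storage/lock.h in a
PG 16 tree; quoting from memory, so worth double-checking against
whichever version you're reading:)

    /* A lock tag's hash value picks one of NUM_LOCK_PARTITIONS
     * (1 << LOG2_NUM_LOCK_PARTITIONS = 16) partitions, each guarded by
     * its own lock_manager lwlock in MainLWLockArray. */
    #define LockHashPartition(hashcode) \
        ((hashcode) % NUM_LOCK_PARTITIONS)
    #define LockHashPartitionLock(hashcode) \
        (&MainLWLockArray[LOCK_MANAGER_LWLOCK_OFFSET + \
            LockHashPartition(hashcode)].lock)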

In contrast, growing the number of fastpath slots lets more lock
acquisitions avoid the slowpath entirely (i.e. no lock_manager lwlock
needs to be acquired at all).

The saturation condition we'd like to solve is heavy contention on one or
more of the lock_manager lwlocks. Since that is driven by the slowpath
acquisition rate of heavyweight locks, avoiding the slowpath is better than
just moderately reducing the contention on the slowpath.

To be fair, increasing the number of lock_manager locks definitely can help
to a certain extent, but it doesn't cover an important general case. As a
thought experiment, suppose we increase the lock_manager tranche to some
arbitrarily large size, larger than the number of relations in the db.
This unrealistically large size means we have the best case for avoiding
collisions -- each relation maps uniquely onto its own lock_manager
lwlock. That helps a lot in the case where the workload is spread among
many non-overlapping sets of relations. But it doesn't help a workload
where any one table is accessed frequently via slowpath locking.

Continuing the thought experiment, if that frequently queried table has 16
or more indexes, or if it is joined to other tables that collectively add
up to over 16 relations, then each of those queries is guaranteed to take
the slowpath for at least some of its locks and to acquire the
deterministically associated lock_manager lwlocks.
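
To put rough numbers on that (ignoring whatever other relation locks the
backend may already be holding in its slots): with the current 16 slots,
one table plus its 16 indexes means 17 relation locks, so at least 1
acquisition per query is pushed onto the slowpath; a join of two such
tables means 34 relation locks, at least 18 of them via the slowpath.
Doubling the slot count to 32 would keep the first case entirely on the
fastpath.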

So growing the tranche of lock_manager lwlocks would help for some
workloads, while other workloads would not be helped much at all. (As a
concrete example, a workload at GitLab has several frequently queried
tables with over 16 indexes that consequently always use at least some
slowpath locks.)

For additional context:

https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/2301#what-influences-lock_manager-lwlock-acquisition-rate
Summarizes the pathology and its current mitigations.

https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/2301#note_1357834678
Documents the supporting research methodology.

https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/2301#note_1365370510
What code paths require an exclusive mode lwlock for lock_manager?

https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/2301#note_1365595142
Comparison of fastpath vs. slowpath locking, including quantifying the rate
difference.

https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/2301#note_1365630726
Confirms that the acquisition rate of lock_manager locks is not uniform.
In the sampled workload there is a 3x difference between the most and
least frequently acquired lock_manager lock, corresponding to the
workload's most frequently accessed relations.

> Well, that has a cost too, as it makes PGPROC larger, right? At the
> moment that struct is already ~880B / 14 cachelines, adding 48 XIDs
> would make it +192B / +3 cachelines. I doubt that won't impact other
> common workloads ...

That's true; growing the data structure may affect L2/L3 cache hit rates
when touching PGPROC. Is that cost worth the benefit of using fastpath for
a higher percentage of table locks? The answer may be workload- and
platform-specific. Exposing this as a GUC gives the admin a way to make a
different choice if our default (currently 16) is bad for them.
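
For concreteness, the PGPROC fields involved look like this (copied
loosely from src/include/storage/proc.h as of PG 16), along with the size
arithmetic behind the numbers you quoted:

    /* Per-backend fastpath lock state inside struct PGPROC (PG 16). */
    LWLock      fpInfoLock;  /* protects the per-backend fastpath state */
    uint64      fpLockBits;  /* lock modes held, 3 bits per slot */
    Oid         fpRelId[FP_LOCK_SLOTS_PER_BACKEND]; /* 16 * 4 B = 64 B */

    /* Growing the slot count from 16 to 64 adds 48 * sizeof(Oid) = 192 B,
     * i.e. 3 more 64-byte cachelines; and if I'm reading lock.c right,
     * fpLockBits would also have to grow past 64 bits to keep 3 mode
     * bits per slot. */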

I share your reluctance to add another low-level tunable, but as with many
other GUCs, having a generally reasonable default that can be adjusted is
better than forcing folks to fork postgres to adjust a compile-time
constant. And unfortunately I don't see a better way to solve this
problem. Growing the lock_manager lwlock tranche isn't as effective,
because it doesn't help workloads where one or more relations are locked
frequently enough to hit this saturation point.
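
To make that concrete, I imagine the knob would end up as something like
the following entry in guc_tables.c. To be clear, the name, default, and
bounds below are purely illustrative placeholders, not a worked-out
patch, and the fixed-size fpRelId array would have to become sized at
postmaster startup for any of this to work:

    /* Hypothetical PGC_POSTMASTER GUC; everything here is illustrative. */
    {
        {"fast_path_lock_slots_per_backend", PGC_POSTMASTER, LOCK_MANAGEMENT,
            gettext_noop("Sets the number of fast-path relation lock slots "
                         "reserved for each backend."),
            NULL
        },
        &fast_path_lock_slots_per_backend,
        16, 16, 1024,
        NULL, NULL, NULL
    },

PGC_POSTMASTER seems like the right context, since the value would have to
be fixed before shared memory and the PGPROC array are sized.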

Handling a larger percentage of heavyweight lock acquisitions via fastpath
instead of slowpath seems likely to help many high-throughput workloads,
since it avoids having to acquire an lwlock in exclusive mode. It seems
like the least intrusive general-purpose solution we've come up with so
far.
That's why we wanted to solicit feedback or new ideas from the community.
Currently, the only options folks have for addressing this class of
saturation are some combination of schema changes, application changes,
vertical scaling, and spreading the query rate across more postgres
instances. Those options are often neither feasible nor efficient.
Lacking a better solution, exposing a GUC that rarely needs tuning seems
reasonable to me.

Anyway, hopefully the extra context is helpful! Please do share your
thoughts.

--
*Matt Smiley* | Staff Site Reliability Engineer at GitLab
