Re: Add progressive backoff to XactLockTableWait functions

From: Xuneng Zhou <xunengzhou(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Kevin K Biju <kevinkbiju(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add progressive backoff to XactLockTableWait functions
Date: 2025-08-01 09:25:34
Message-ID: CABPTF7W9CqbisVWT7DwfcnmUW7iUJXd2h9VFd20HmDBD6Ni45A@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Wed, Jul 30, 2025 at 11:42 PM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Mon, Jul 28, 2025 at 7:14 PM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > On Thu, Jul 17, 2025 at 10:54 PM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> > >
> > > Hi,
> > >
> > > After studying procarray and lmgr more closely, I have found several
> > > critical issues in the two patches and had underestimated their
> > > complexity and subtlety. Sorry for posting immature patches that may
> > > have created noise.
> > >
> > > I now realize that making the lock acquisition in XactLockTableWait,
> > > (void) LockAcquire(&tag, ShareLock, false, false), work on a standby,
> > > following Andres’ first suggestion, may be simpler and require fewer
> > > new helpers.
> > >
> > > XactLockTableWait fails on a standby simply because we never call
> > > XactLockTableInsert there. I am considering invoking XactLockTableInsert
> > > when an XID is added to KnownAssignedXids, and XactLockTableDelete
> > > (currently never used for main XIDs on the primary) when it is removed.
> > > A preliminary implementation and test suggest the approach basically
> > > works, but I still need to confirm that it is safe on a standby and
> > > does not break other things.
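> > >
> > > A rough sketch of the shape I have in mind (the helper names below are
> > > made up for illustration; where exactly they are called relative to the
> > > KnownAssignedXids locking is the open question):
> > >
> > >     /* startup process: an XID becomes tracked on the standby */
> > >     static void
> > >     StandbyXactLockInsert(TransactionId xid)
> > >     {
> > >         /* the same call the primary makes when a transaction gets its XID */
> > >         XactLockTableInsert(xid);
> > >     }
> > >
> > >     /* startup process: the XID is removed at commit/abort */
> > >     static void
> > >     StandbyXactLockDelete(TransactionId xid)
> > >     {
> > >         /* on the primary this is currently only used for subxids */
> > >         XactLockTableDelete(xid);
> > >     }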
> >
> > Adding an xact lock for every tracked assigned XID on a standby—by
> > enabling XactLockTableInsert and XactLockTableDelete in the
> > KnownAssignedXids infrastructure—appears less attractive than I first
> > thought. The main concern is that the actual rate of xact-lock usage
> > may not justify the added overhead, even if that overhead is modest.
> >
> > 1) Usage
> > On a primary server, an xact lock is taken for every assigned XID, and
> > the cost is justified because xact locks are referenced in many code
> > paths. On a standby, however—especially in versions where logical
> > decoding is disabled—xact locks are used far less, if at all. The
> > current call stacks for XactLockTableWait on a standby (HEAD) are
> > listed in [1]. The only high-frequency caller is the logical walsender
> > in a cascading standby; all other callers are infrequent. Does that
> > single use case warrant creating an xact lock for every known-assigned
> > XID? I don't think so, given that typical configurations have few
> > cascading standbys. In practice, most xact locks may never be touched
> > after they are created.
> >
> > 2) Cost
> > Low usage would be fine if the additional cost were negligible, but
> > that does not appear to be the case AFAICS.
> >
> > * There are 5 main uses of the KnownAssignedXids data structure:
> > * * backends taking snapshots - all valid XIDs need to be copied out
> > * * backends seeking to determine presence of a specific XID
> > * * startup process adding new known-assigned XIDs
> > * * startup process removing specific XIDs as transactions end
> > * * startup process pruning array when special WAL records arrive
> > *
> > * This data structure is known to be a hot spot during Hot Standby, so we
> > * go to some lengths to make these operations as efficient and as concurrent
> > * as possible.
> >
> > KnownAssignedXids is a hot spot like the comments state. The existing
> > design replaces a hash table with a single array and minimizes
> > exclusive locks and memory barriers to preserve concurrency. To
> > respect that design, calls to XactLockTableInsert and
> > XactLockTableDelete would need to occur outside any exclusive lock.
> > Unfortunately, that re-introduces the polling we are trying to
> > eliminate in XactLockTableWait: there is a window between registering
> > the XIDs and adding their xact locks. During that window, a backend
> > calling XactLockTableWait may need to poll because
> > TransactionIdIsInProgress shows the transaction as running while the
> > xact lock is still missing. If the window were short, this might be
> > acceptable, but it can be lengthy because XIDs are inserted into and
> > deleted from the array in bulk. Some unlucky backends will therefore
> > spin before they can actually wait.
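> >
> > For reference, the polling in question is this loop in XactLockTableWait
> > (condensed from lmgr.c; asserts and the XLTW callback setup omitted):
> >
> >     for (;;)
> >     {
> >         SET_LOCKTAG_TRANSACTION(tag, xid);
> >
> >         /* returns immediately if no xact lock was ever inserted */
> >         (void) LockAcquire(&tag, ShareLock, false, false);
> >         LockRelease(&tag, ShareLock, false);
> >
> >         if (!TransactionIdIsInProgress(xid))
> >             break;
> >
> >         /* still "in progress" but nothing to block on: sleep and retry */
> >         if (!first)
> >             pg_usleep(1000L);
> >         first = false;
> >         xid = SubTransGetTopmostTransaction(xid);
> >     }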
> >
> > Placing the XactLockTableInsert/Delete calls inside the exclusive lock
> > removes the race window but hurts concurrency—the very issue this
> > module strives to avoid. Operations on the shared-memory hash table in
> > lmgr are slower and hold the lock longer than simple array updates, so
> > backends invoking GetSnapshotData could experience longer wait times.
> >
> > 3) Alternative
> > Adopt the hash-table + condition-variable design from patch v6, with
> > additional fixes and polish.
> > Backends that call XactLockTableWait() sleep on a condition
> > variable. The startup process broadcasts a wake-up as the XID is
> > removed from KnownAssignedXids. The broadcast happens after releasing
> > the exclusive lock. This may be safe—sleeping backends remain blocked
> > until the CV signal arrives, so no extra polling occurs even though
> > the XID has already vanished from the array. For consistency, all
> > removal paths (including those used during shutdown or promotion,
> > where contention is minimal) follow the same pattern. The hash table
> > is sized at 2 × MaxBackends, which is already conservative for today’s
> > workload. For more potential concurrent use cases in the future, it
> > could be switched to a larger size such as TOTAL_MAX_CACHED_SUBXIDS if
> > necessary; even that would consume only a few megabytes.
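> >
> > Condensed, the wait/wake pattern looks roughly like this (not the exact
> > patch code, just the shape of the idea; XidWaitHash is the shared hash
> > table and WAIT_EVENT_XID_WAIT a placeholder for a new wait event):
> >
> >     typedef struct XidWaitEntry
> >     {
> >         TransactionId     xid;   /* hash key */
> >         ConditionVariable cv;    /* waiters sleep here */
> >     } XidWaitEntry;
> >
> >     /* backend side, instead of polling */
> >     static void
> >     XidWaitOnStandby(TransactionId xid)
> >     {
> >         bool          found;
> >         XidWaitEntry *entry;
> >
> >         entry = hash_search(XidWaitHash, &xid, HASH_ENTER, &found);
> >         if (!found)
> >             ConditionVariableInit(&entry->cv);
> >
> >         ConditionVariablePrepareToSleep(&entry->cv);
> >         while (TransactionIdIsInProgress(xid))
> >             ConditionVariableSleep(&entry->cv, WAIT_EVENT_XID_WAIT);
> >         ConditionVariableCancelSleep();
> >     }
> >
> >     /* startup process side, after the XID leaves KnownAssignedXids */
> >     static void
> >     XidWakeOnStandby(TransactionId xid)
> >     {
> >         XidWaitEntry *entry;
> >
> >         entry = hash_search(XidWaitHash, &xid, HASH_FIND, NULL);
> >         if (entry != NULL)
> >             ConditionVariableBroadcast(&entry->cv);
> >         /* entry removal/recycling is elided in this sketch */
> >     }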
> >
> > Feedback welcome.
> >
> > [1] https://www.postgresql.org/message-id/CABPTF7Wbp7MRPGsqd9NA4GbcSzUcNz1ymgWfir=Yf+N0oDRbjA@mail.gmail.com
>
> Because xact locks are not maintained on a standby server, and the
> current waiting approach does not rely on them either, I added
> assertions to both functions to prevent their use on a standby and
> introduced an explicit primary-versus-standby branch in
> SnapBuildWaitSnapshot in v8.
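>
> Roughly (placeholder code showing the intent, not the exact patch hunks):
>
>     /* XactLockTableWait() / ConditionalXactLockTableWait() */
>     Assert(!RecoveryInProgress());
>
>     /* SnapBuildWaitSnapshot(), for each still-running XID */
>     if (RecoveryInProgress())
>         XidWaitOnStandby(xid);
>     else
>         XactLockTableWait(xid, NULL, NULL, XLTW_None);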
>
> In procarray.c, two corrections were made compared with v7:
> - Removed the seemingly redundant "initialized" field from XidWaitEntry.
> - Removed the erroneous InHotStandby checks in the wait and wake helpers.
>
> With these changes applied, here are some perf stats for v8; the
> numbers are about one order of magnitude smaller than those on HEAD.
>
> Performance counter stats for process id '3273840':
>
>     351,183,876      cycles
>     126,586,090      instructions          #  0.36 insn per cycle
>      16,670,633      cache-misses
>          10,030      context-switches
>
> 100.001419856 seconds time elapsed
>

V9 replaces the original partitioned xid-wait htab with a single,
unified one, reflecting the modest entry count and the rarity of
contention among waiters. To prevent possible races when multiple
backends wait on the same XID for the first time in XidWaitOnStandby, a
dedicated lock has been added to protect the hash table.
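
In other words, first-time entry creation is now serialized, along these
lines (XidWaitHashLock is a placeholder name; only the waiter side is
shown):

    LWLockAcquire(XidWaitHashLock, LW_EXCLUSIVE);
    entry = hash_search(XidWaitHash, &xid, HASH_ENTER, &found);
    if (!found)
        ConditionVariableInit(&entry->cv);
    LWLockRelease(XidWaitHashLock);

    ConditionVariablePrepareToSleep(&entry->cv);
    while (TransactionIdIsInProgress(xid))
        ConditionVariableSleep(&entry->cv, WAIT_EVENT_XID_WAIT);
    ConditionVariableCancelSleep();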

Best,
Xuneng

Attachment Content-Type Size
v9-0001-Optimize-transaction-waiting-during-logical-decod.patch application/octet-stream 15.1 KB
