Re: Add progressive backoff to XactLockTableWait functions

From: Xuneng Zhou <xunengzhou(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Kevin K Biju <kevinkbiju(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add progressive backoff to XactLockTableWait functions
Date: 2025-08-29 07:04:20
Message-ID: CABPTF7V25W6KshXCi3cG5oWY+temCQWs0x+tmrU3iSen4jJQVQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Fri, Aug 8, 2025 at 7:06 PM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
>
> Hi, Tom!
>
> Thanks for looking at this.
>
> On Fri, Aug 8, 2025 at 2:20 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > Xuneng Zhou <xunengzhou(at)gmail(dot)com> writes:
> > > V9 replaces the original partitioned xid-wait htab with a single,
> > > unified one, reflecting the modest entry count and rare contention for
> > > waiting. To prevent possible races when multiple backends wait on the
> > > same XID for the first time in XidWaitOnStandby, a dedicated lock has
> > > been added to protect the hash table.
> >
> > This seems like adding quite a lot of extremely subtle code in
> > order to solve a very small problem. I thought the v1 patch
> > was about the right amount of complexity.
>
> Yeah, this patch is indeed complex, and the complexity might not be
> well-justified—given the current use cases, it feels like we’re paying
> a lot for very little. TBH, getting the balance right between
> efficiency gains and cost, in terms of both code complexity and
> runtime overhead, is beyond my current ability here, since I’m
> touching many parts of the code for the first time. Every time I
> thought I’d figured it out, new subtleties surfaced—though I’ve
> learned a lot from the exploration and hacking. We may agree on the
> necessity of fixing this issue, but not yet on how to fix it. I’m open
> to discussion and suggestions.
>

Some changes in v10:

1) XidWaitHashLock is now taken for all operations on XidWaitHash,
although it may be unnecessary in some cases.
2) The pg_atomic_uint32 waiter_count field was removed from
XidWaitEntry. The startup process is now responsible for cleaning up
the XidWaitHash entry after waking up the waiting processes.
3) A pg_atomic_uint32 xidWaiterNum was added to avoid unnecessary lock
acquisition and release, and hash table lookups, when no xid is being
waited on.
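
To illustrate the idea behind item 3, here is a minimal standalone sketch, not the code from the patch. It assumes a tiny fixed-size table in place of XidWaitHash and a C11 atomic_flag spinlock in place of XidWaitHashLock; the names wait_table, register_wait, wake_waiters and lock_acquisitions are all hypothetical. The waker checks an atomic waiter counter first and touches the lock and the table only when somebody is actually waiting.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_WAITED_XIDS 8               /* hypothetical capacity */

typedef struct
{
    uint32_t xid;
    bool     in_use;
} WaitSlot;

static WaitSlot    wait_table[MAX_WAITED_XIDS];   /* stand-in for XidWaitHash */
static atomic_flag table_lock = ATOMIC_FLAG_INIT; /* stand-in for XidWaitHashLock */
static atomic_uint xid_waiter_num;                /* plays the role of xidWaiterNum */
static unsigned    lock_acquisitions;             /* demo instrumentation only */

static void lock_table(void)
{
    while (atomic_flag_test_and_set(&table_lock))
        ;                               /* spin until the flag is released */
    lock_acquisitions++;                /* safe: updated while holding the lock */
}

static void unlock_table(void)
{
    atomic_flag_clear(&table_lock);
}

/* Waiter side: record that someone waits on xid. Returns false if full. */
static bool register_wait(uint32_t xid)
{
    bool ok = false;

    lock_table();
    for (int i = 0; i < MAX_WAITED_XIDS; i++)
    {
        if (!wait_table[i].in_use)
        {
            wait_table[i].xid = xid;
            wait_table[i].in_use = true;
            atomic_fetch_add(&xid_waiter_num, 1);
            ok = true;
            break;
        }
    }
    unlock_table();
    return ok;
}

/* Waker side: remove all entries for xid and return how many were removed. */
static int wake_waiters(uint32_t xid)
{
    int removed = 0;

    /* Fast path: nobody is waiting, so skip the lock and the table scan. */
    if (atomic_load(&xid_waiter_num) == 0)
        return 0;

    lock_table();
    for (int i = 0; i < MAX_WAITED_XIDS; i++)
    {
        if (wait_table[i].in_use && wait_table[i].xid == xid)
        {
            assert(atomic_load(&xid_waiter_num) > 0);
            wait_table[i].in_use = false;   /* waker cleans up the entry */
            atomic_fetch_sub(&xid_waiter_num, 1);
            removed++;
        }
    }
    unlock_table();
    return removed;
}
```

In this sketch a waiter that registers just after the waker has read the counter would miss the wakeup, so real code would presumably need the waiter to recheck the transaction state after registering; how the patch handles that ordering is not shown here.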

I hope this eliminates some of the subtleties.

The exponential backoff in the earlier patches is simple and effective
at reducing CPU overhead during extended waits; however, it could also
add unwanted latency for more sensitive use cases such as a logical
walsender on cascading standbys. Unfortunately, I have not been able to
come up with a solution that is correct, effective, and simple in all
cases.
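
For reference, the backoff trade-off can be sketched as follows. This is an illustration only: the 1 ms starting delay, the 1 s cap and the function names are assumptions, not values taken from the earlier patches.

```c
#include <stdint.h>

#define BACKOFF_INITIAL_US 1000      /* hypothetical: first sleep is 1 ms */
#define BACKOFF_MAX_US     1000000   /* hypothetical: sleeps are capped at 1 s */

/* Return the next sleep length given the previous one (0 means "first poll"). */
static int64_t next_backoff_us(int64_t current_us)
{
    if (current_us <= 0)
        return BACKOFF_INITIAL_US;
    if (current_us >= BACKOFF_MAX_US / 2)
        return BACKOFF_MAX_US;       /* doubling would reach or pass the cap */
    return current_us * 2;
}

/* Total time spent sleeping after the given number of polls. */
static int64_t total_wait_us(int polls)
{
    int64_t delay = 0;
    int64_t total = 0;

    for (int i = 0; i < polls; i++)
    {
        delay = next_backoff_us(delay);
        total += delay;
    }
    return total;
}
```

With these assumed values the sleep reaches the cap on the eleventh poll. From then on, a transaction that finishes right after a sleep begins goes unnoticed for up to a full second, which is the kind of latency that would hurt a logical walsender.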

Best,
Xuneng

Attachment: v10-0001-Optimize-transaction-waiting-during-logical-deco.patch (application/x-patch, 15.6 KB)
