Re: [Patch]: Fix excessive ProcArrayLock acquisitions with subscription max_retention_duration=0

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: [Patch]: Fix excessive ProcArrayLock acquisitions with subscription max_retention_duration=0
Date: 2026-04-27 10:15:51
Message-ID: CAJpy0uCxN3t==fHuB6oPsUqJ-MHk7OTK7uJETyjSa90noe4A9w@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 27, 2026 at 3:18 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Mon, Apr 27, 2026 at 2:11 PM SATYANARAYANA NARLAPURAM
> <satyanarlapuram(at)gmail(dot)com> wrote:
> >
> > Hi Hackers,
> >
> > When a subscription has retain_dead_tuples enabled with maxretention set
> > to zero (unlimited retention), adjust_xid_advance_interval() caps
> > xid_advance_interval to Min(interval, maxretention). Since maxretention
> > is zero, this always collapses the interval to zero milliseconds.
> >
> > A zero makes TimestampDifferenceExceeds(last_time, now, 0) always
> > true in get_candidate_xid(). This causes the apply worker to call
> > GetOldestActiveTransactionId() on every single WAL message. This results in
> > a huge number of ProcArrayLock acquisitions under moderate write load.
> >

I agree with the problem statement. I can see it in my debugging.
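
For anyone following along, here is a minimal standalone sketch of why a zero
threshold defeats the check. It is not the PostgreSQL source:
difference_exceeds() is a hypothetical stand-in that assumes the test is
"elapsed >= threshold", with timestamps in microseconds and the threshold in
milliseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a "has at least threshold_ms elapsed?" check.
 * Timestamps are in microseconds, the threshold is in milliseconds.
 */
static bool
difference_exceeds(int64_t start_us, int64_t stop_us, int threshold_ms)
{
    assert(threshold_ms >= 0);

    /* With threshold_ms == 0 this is true for any non-negative elapsed time. */
    return (stop_us - start_us) >= (int64_t) threshold_ms * 1000;
}
```

With threshold_ms = 0 the function returns true even when stop_us equals
start_us, so any caller gated on it does its work on every invocation.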

> > Fix by adding a maxretention > 0 guard to the cap. When maxretention is zero,
> > the exponential back-off in adjust_xid_advance_interval()
> > now works correctly, growing the interval from 100 ms toward the 180 s
> > ceiling.

Yes, this should work. Let's see what others have to say on this.
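
To make the difference concrete, below is a simplified model of the cap with
and without the guard. It is only a sketch under assumptions: the function
names, the MIN_INTERVAL_MS / MAX_INTERVAL_MS macros and the doubling rule are
illustrative (the 100 ms and 180 s values are the ones quoted above), and the
real adjust_xid_advance_interval() has more cases than this.

```c
#include <assert.h>

#define MIN_INTERVAL_MS 100             /* starting interval, per the thread */
#define MAX_INTERVAL_MS (180 * 1000)    /* back-off ceiling, per the thread */

static int
min_int(int a, int b)
{
    return a < b ? a : b;
}

/* One back-off step: start at the minimum, then double up to the ceiling. */
static int
backoff_step(int interval_ms)
{
    assert(interval_ms >= 0);
    if (interval_ms == 0)
        return MIN_INTERVAL_MS;
    return min_int(interval_ms * 2, MAX_INTERVAL_MS);
}

/* Unguarded cap: max_retention_ms == 0 collapses the result to zero. */
static int
next_interval_unguarded(int interval_ms, int max_retention_ms)
{
    return min_int(backoff_step(interval_ms), max_retention_ms);
}

/* Guarded cap: zero means "unlimited", so the cap is skipped. */
static int
next_interval_guarded(int interval_ms, int max_retention_ms)
{
    int         next = backoff_step(interval_ms);

    if (max_retention_ms > 0)
        next = min_int(next, max_retention_ms);
    return next;
}
```

In this model next_interval_unguarded(100, 0) yields 0, while
next_interval_guarded(100, 0) yields 200 and keeps doubling on later calls
until it reaches the ceiling. A non-zero limit still caps the result as before.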

> > Measured with perf uprobe counting GetOldestActiveTransactionId calls
> > at ~39K TPS (pgbench, 5 clients):
> >
> > Before fix: 25,104 calls / 5 s (~5,021/s)
> > After fix: 31 calls / 5 s (~6/s)
> >

Just curious, how did you catch this problem? Did it show up in any of
your profiling reports?

thanks
Shveta
