| From: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
|---|---|
| To: | SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Subject: | Re: [Patch]: Fix excessive ProcArrayLock acquisitions with subscription max_retention_duration=0 |
| Date: | 2026-04-27 10:15:51 |
| Message-ID: | CAJpy0uCxN3t==fHuB6oPsUqJ-MHk7OTK7uJETyjSa90noe4A9w@mail.gmail.com |
| Lists: | pgsql-hackers |
On Mon, Apr 27, 2026 at 3:18 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Mon, Apr 27, 2026 at 2:11 PM SATYANARAYANA NARLAPURAM
> <satyanarlapuram(at)gmail(dot)com> wrote:
> >
> > Hi Hackers,
> >
> > When a subscription has retain_dead_tuples enabled with maxretention set
> > to zero (unlimited retention), adjust_xid_advance_interval() caps
> > xid_advance_interval to Min(interval, maxretention). Since maxretention
> > is zero, this always collapses the interval to zero milliseconds.
> >
> > A zero makes TimestampDifferenceExceeds(last_time, now, 0) always
> > true in get_candidate_xid(). This causes the apply worker to call
> > GetOldestActiveTransactionId() on every single WAL message. This results in
> > a huge number of ProcArrayLock acquisitions under moderate write load.
> >
I agree with the problem statement. I can see it in my debugging.
> > Fix by adding a maxretention > 0 guard to the cap. When maxretention is zero,
> > the exponential back-off in adjust_xid_advance_interval()
> > now works correctly, growing the interval from 100 ms toward the 180 s
> > ceiling.
Yes, this should work. Let's see what others have to say on this.
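
To make the effect of the guard concrete, here is a minimal standalone sketch of the interval computation as described above. It is not the patch itself: next_xid_advance_interval() and its apply_guard flag are hypothetical names for illustration, the doubling factor is an assumption about how the exponential back-off grows, and only the 100 ms floor, the 180 s ceiling, and the Min(interval, maxretention) cap come from the description in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Values taken from the thread: 100 ms starting point, 180 s ceiling. */
#define MIN_XID_ADVANCE_INTERVAL_MS 100
#define MAX_XID_ADVANCE_INTERVAL_MS 180000

static int
min_int(int a, int b)
{
    return (a < b) ? a : b;
}

/*
 * Hypothetical stand-in for the back-off step in
 * adjust_xid_advance_interval(). Doubling is an assumption; the thread
 * only says the back-off is exponential.
 *
 * apply_guard = false models the current behavior (cap always applied).
 * apply_guard = true models the proposed fix (cap only if maxretention > 0).
 */
static int
next_xid_advance_interval(int interval_ms, int maxretention_ms,
                          bool apply_guard)
{
    assert(interval_ms >= 0 && interval_ms <= MAX_XID_ADVANCE_INTERVAL_MS);

    /* Exponential back-off, bounded by the ceiling. */
    interval_ms = min_int(interval_ms * 2, MAX_XID_ADVANCE_INTERVAL_MS);

    /*
     * Cap by the retention limit. Without the guard, maxretention = 0
     * (meaning "unlimited") collapses the interval to 0 ms.
     */
    if (!apply_guard || maxretention_ms > 0)
        interval_ms = min_int(interval_ms, maxretention_ms);

    return interval_ms;
}
```

With maxretention_ms = 0, the unguarded variant returns 0 for any input, which is what makes the TimestampDifferenceExceeds(last_time, now, 0) check fire on every message. The guarded variant lets the interval grow from MIN_XID_ADVANCE_INTERVAL_MS toward the ceiling, while a positive maxretention_ms still caps it as before.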
> > Measured with perf uprobe counting GetOldestActiveTransactionId calls
> > at ~39K TPS (pgbench, 5 clients):
> >
> > Before fix: 25,104 calls / 5 s (~5,021/s)
> > After fix: 31 calls / 5 s (~6/s)
> >
Just curious, how did you catch this problem? Did it show up in any of
your profiling reports?
Thanks,
Shveta