Re: Introduce XID age based replication slot invalidation

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, John H <johnhyvr(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Introduce XID age based replication slot invalidation
Date: 2026-04-23 18:11:14
Message-ID: CALj2ACWA=VpNhYb8VFZ_uYcs+Vxpid1ZqvYo8fM67cGq+Azt+A@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Wed, Apr 15, 2026 at 10:03 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Mon, Apr 6, 2026 at 10:42 AM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > On Mon, Apr 6, 2026 at 1:45 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > > I took a look at the v10 patch and it LGTM. I tested it - make
> > > > check-world passes, pgindent doesn't complain.
> > >
> > > While reviewing the patch, I found that with this patch, backend
> > > processes and autovacuum workers can simultaneously attempt to
> > > invalidate the same slot for the same reason. When invalidating a
> > > slot, we send a signal to the process owning the slot and wait for it
> > > to exit and release the slot. If the process takes a long time to exit
> > > for some reason, subsequent autovacuum workers attempting to
> > > invalidate the same slot will also send a SIGTERM and get stuck at
> > > InvalidatePossiblyObsoleteSlot(). In the worst case, this could result
> > > in all autovacuum activity being blocked. I think we need to address
> > > this problem.
> >
> > Thank you!
> >
> > You're right that multiple autovacuum workers can wait on the same
> > slot for SIGTERM to take effect on the process (mainly walsenders)
> > holding the slot. Once the process holding the slot exits, one worker
> > finishes the invalidation and the others see it's done and move on.
> >
> > However, IMHO, this is unlikely to be a problem in practice.
> >
> > First, SIGTERM must take a long time to terminate the process holding
> > the slot. This seems unlikely unless I'm missing some cases.
> >
> > Second, the slot's xmin must be very old (past XID age) while the
> > process is still running but slow to exit. If we set max_slot_xid_age
> > close to vacuum_failsafe_age (e.g., 1.6 billion; I've added this note
> > in the docs), it seems unlikely that the replication connection would
> > still be active at that point.
> >
> > Also, concurrent invalidation can already happen today between the
> > startup process and checkpointer on standby.
> >
> > If needed, we could add a flag to skip extra invalidation attempts
> > based on field experience. Since this feature is off by default, I'd
> > prefer to keep things simple, but I'm open to other approaches.
> >
> > Thoughts?
>
> Thank you Sawada-san. I've been thinking more about it and I agree we
> need to address this. While I still think the scenario is unlikely in
> practice (SIGTERM would have to take a long time, the slot's xmin
> would have to be very old while the walsender is still running, etc.),
> I think it's worth handling.
>
> I can think of a couple of approaches:
>
> 1. Use ConditionVariableTimedSleep instead of ConditionVariableSleep
> when called from an autovacuum worker. Workers don't block forever,
> but they still wait for the timeout duration, still send redundant
> SIGTERMs, and a correct timeout value needs to be chosen. When it
> expires, the worker either retries (still stuck) or gives up (same as
> approach 2).
>
> 2. Make the vacuum path non-blocking when another process is already
> invalidating the same slot. The first process to attempt invalidation
> proceeds normally: it sends SIGTERM and waits on
> ConditionVariableSleep for the process holding the slot to exit. But
> if a subsequent autovacuum worker finds that another process has
> already initiated invalidation of this slot, it skips the slot and
> proceeds with vacuum instead of waiting on the same
> ConditionVariableSleep.
>
> I think approach 2 is simple. If another process is already
> invalidating the slot, there's no reason for the autovacuum worker to
> also block. The tradeoff is that this vacuum cycle's OldestXmin won't
> move forward and it will need another cycle for this relation. But
> that's fine given that the scenario as explained above is unlikely to
> happen in practice.
>
> Please let me know if my thinking sounds reasonable. I'm open to other
> ideas too.
>
> Thoughts?

I implemented approach 2 (patch 0003). I also added an injection point
that mimics a walsender taking a long time to process SIGTERM, so that
the process invalidating the slot waits on the slot's condition variable
(CV).
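
To make the idea behind approach 2 concrete, here is a minimal standalone
sketch. It is not the code in patch 0003: DemoSlot, DemoResult and the
demo_* functions are hypothetical names made up for illustration, and the
C11 atomic flag is an assumption standing in for however the real slot
state is tracked. The first caller claims the invalidation, and any later
caller that finds it already claimed skips the slot instead of waiting.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a replication slot (not PostgreSQL's struct). */
typedef struct DemoSlot
{
	atomic_bool invalidation_in_progress;	/* claimed by first invalidator */
	atomic_bool invalidated;				/* set once invalidation is done */
} DemoSlot;

typedef enum DemoResult
{
	DEMO_INVALIDATED,			/* this caller performed the invalidation */
	DEMO_SKIPPED,				/* another process is already invalidating */
	DEMO_ALREADY_DONE			/* slot was invalidated earlier */
} DemoResult;

static void
demo_slot_init(DemoSlot *slot)
{
	atomic_init(&slot->invalidation_in_progress, false);
	atomic_init(&slot->invalidated, false);
}

/* Try to become the single invalidator; returns false if one exists. */
static bool
demo_begin_invalidation(DemoSlot *slot)
{
	bool		expected = false;

	return atomic_compare_exchange_strong(&slot->invalidation_in_progress,
										  &expected, true);
}

/* Called by the invalidator once the slot's owner has exited. */
static void
demo_finish_invalidation(DemoSlot *slot)
{
	assert(atomic_load(&slot->invalidation_in_progress));
	atomic_store(&slot->invalidated, true);
	atomic_store(&slot->invalidation_in_progress, false);
}

/* Non-blocking path, as an autovacuum worker would use in this sketch. */
static DemoResult
demo_vacuum_path_invalidate(DemoSlot *slot)
{
	if (atomic_load(&slot->invalidated))
		return DEMO_ALREADY_DONE;

	if (!demo_begin_invalidation(slot))
		return DEMO_SKIPPED;	/* do not wait; carry on with vacuum */

	/* Real code would send SIGTERM here and sleep on the slot's CV. */
	demo_finish_invalidation(slot);
	return DEMO_INVALIDATED;
}
```

In this sketch a skipped worker simply continues with vacuum, so its
OldestXmin does not advance in that cycle, which matches the tradeoff
described for approach 2 above.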

Please have a look and share your thoughts. Thank you!

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com

Attachments:

  v11-0001-Introduce-max_slot_xid_age-to-invalidate-old-rep.patch   application/x-patch   38.7 KB
  v11-0002-Add-more-tests-for-XID-age-slot-invalidation.patch       application/x-patch    6.3 KB
  v11-0003-Avoid-concurrent-XID-age-slot-invalidation-attem.patch   application/x-patch   12.2 KB
