| From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
|---|---|
| To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
| Cc: | Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, John H <johnhyvr(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Introduce XID age based replication slot invalidation |
| Date: | 2026-04-23 18:11:14 |
| Message-ID: | CALj2ACWA=VpNhYb8VFZ_uYcs+Vxpid1ZqvYo8fM67cGq+Azt+A@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi,
On Wed, Apr 15, 2026 at 10:03 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Mon, Apr 6, 2026 at 10:42 AM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > On Mon, Apr 6, 2026 at 1:45 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > > I took a look at the v10 patch and it LGTM. I tested it - make
> > > > check-world passes, pgindent doesn't complain.
> > >
> > > While reviewing the patch, I found that with this patch, backend
> > > processes and autovacuum workers can simultaneously attempt to
> > > invalidate the same slot for the same reason. When invalidating a
> > > slot, we send a signal to the process owning the slot and wait for it
> > > to exit and release the slot. If the process takes a long time to exit
> > > for some reason, subsequent autovacuum workers attempting to
> > > invalidate the same slot will also send a SIGTERM and get stuck at
> > > InvalidatePossiblyObsoleteSlot(). In the worst case, this could result
> > > in all autovacuum activity being blocked. I think we need to address
> > > this problem.
> >
> > Thank you!
> >
> > You're right that multiple autovacuum workers can wait on the same
> > slot for SIGTERM to take effect on the process (mainly walsenders)
> > holding the slot. Once the process holding the slot exits, one worker
> > finishes the invalidation and the others see it's done and move on.
> >
> > However, IMHO, this is unlikely to be a problem in practice.
> >
> > First, SIGTERM must take a long time to terminate the process holding
> > the slot. This seems unlikely unless I'm missing some cases.
> >
> > Second, the slot's xmin must be very old (past XID age) while the
> > process is still running but slow to exit. If we set max_slot_xid_age
> > close to vacuum_failsafe_age (e.g., 1.6 billion; I've added a note
> > about this in the docs), it seems unlikely that the replication
> > connection would still be active at that point.
> >
> > Also, concurrent invalidation can already happen today between the
> > startup process and checkpointer on standby.
> >
> > If needed, we could add a flag to skip extra invalidation attempts
> > based on field experience. Since this feature is off by default, I'd
> > prefer to keep things simple, but I'm open to other approaches.
> >
> > Thoughts?
>
> Thank you Sawada-san. I've been thinking more about it and I agree we
> need to address this. While I still think the scenario is unlikely in
> practice (SIGTERM would have to take a long time, the slot's xmin
> would have to be very old while the walsender is still running, etc.),
> I think it's worth handling.
>
> I can think of a couple of approaches:
>
> 1. Use ConditionVariableTimedSleep instead of ConditionVariableSleep
> when called from an autovacuum worker. Workers don't block forever,
> but they still wait for the timeout duration, still send redundant
> SIGTERMs, and a correct timeout value needs to be chosen. When it
> expires, the worker either retries (still stuck) or gives up (same as
> approach 2).
>
> 2. Make the vacuum path non-blocking when another process is already
> invalidating the same slot. The first process to attempt invalidation
> proceeds normally: it sends SIGTERM and waits on
> ConditionVariableSleep for the process holding the slot to exit. But
> if a subsequent autovacuum worker finds that another process has
> already initiated invalidation of this slot, it skips the slot and
> proceeds with vacuum instead of waiting on the same
> ConditionVariableSleep.
>
> I think approach 2 is simple. If another process is already
> invalidating the slot, there's no reason for the autovacuum worker to
> also block. The tradeoff is that this vacuum cycle's OldestXmin won't
> move forward and it will need another cycle for this relation. But
> that's fine given that the scenario as explained above is unlikely to
> happen in practice.
>
> Please let me know if my thinking sounds reasonable. I'm open to other
> ideas too.
>
> Thoughts?
I implemented approach 2 (patch 0003). I added an injection point
to mimic the walsender taking a long time to process SIGTERM, so that
the process invalidating the slot waits on the slot's CV.
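To illustrate the idea behind approach 2, here is a minimal standalone
sketch. It is not the code in patch 0003: DemoSlot, DemoAction,
demo_begin_invalidation and the invalidation_in_progress flag are
hypothetical names made up for this example, and it only models the
autovacuum worker path. It also assumes the check-and-set runs under
whatever lock protects the slot, which the sketch does not model.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a replication slot; not PostgreSQL's struct. */
typedef struct DemoSlot
{
    bool invalidation_in_progress;  /* set by the first invalidator */
    int  sigterms_sent;             /* counts simulated SIGTERMs */
} DemoSlot;

typedef enum DemoAction
{
    DEMO_SIGNAL_AND_WAIT,   /* caller sends SIGTERM and waits on the CV */
    DEMO_SKIP_SLOT          /* caller skips the slot and continues vacuum */
} DemoAction;

/*
 * Decide what an autovacuum worker does with a slot that is past the
 * XID age limit. Assumption: the caller holds the lock protecting the
 * slot, so the check and the set below happen atomically.
 */
static DemoAction
demo_begin_invalidation(DemoSlot *slot)
{
    /* Someone else already started: no second SIGTERM, no waiting. */
    if (slot->invalidation_in_progress)
        return DEMO_SKIP_SLOT;

    /* First process to get here owns the invalidation attempt. */
    slot->invalidation_in_progress = true;
    slot->sigterms_sent++;
    return DEMO_SIGNAL_AND_WAIT;
}
```

With this shape, only the first caller signals the process holding the
slot and waits; later workers return immediately and vacuum with the
older OldestXmin for that cycle.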
Please have a look and share your thoughts. Thank you!
--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com
| Attachment | Content-Type | Size |
|---|---|---|
| v11-0001-Introduce-max_slot_xid_age-to-invalidate-old-rep.patch | application/x-patch | 38.7 KB |
| v11-0002-Add-more-tests-for-XID-age-slot-invalidation.patch | application/x-patch | 6.3 KB |
| v11-0003-Avoid-concurrent-XID-age-slot-invalidation-attem.patch | application/x-patch | 12.2 KB |