| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
|---|---|
| To: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
| Cc: | Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, John H <johnhyvr(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Introduce XID age based replication slot invalidation |
| Date: | 2026-07-02 23:17:44 |
| Message-ID: | CAD21AoDmNdKrbLJJvU8f5v=6LPJ50DUoQP+ONG5oicD+m-hGUw@mail.gmail.com |
| Lists: | pgsql-hackers |
On Wed, Apr 15, 2026 at 10:03 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Mon, Apr 6, 2026 at 10:42 AM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > On Mon, Apr 6, 2026 at 1:45 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > > I took a look at the v10 patch and it LGTM. I tested it - make
> > > > check-world passes, pgindent doesn't complain.
> > >
> > > While reviewing the patch, I found that with this patch, backend
> > > processes and autovacuum workers can simultaneously attempt to
> > > invalidate the same slot for the same reason. When invalidating a
> > > slot, we send a signal to the process owning the slot and wait for it
> > > to exit and release the slot. If the process takes a long time to exit
> > > for some reason, subsequent autovacuum workers attempting to
> > > invalidate the same slot will also send a SIGTERM and get stuck at
> > > InvalidatePossiblyObsoleteSlot(). In the worst case, this could result
> > > in all autovacuum activity being blocked. I think we need to address
> > > this problem.
> >
> > Thank you!
> >
> > You're right that multiple autovacuum workers can wait on the same
> > slot for SIGTERM to take effect on the process (mainly walsenders)
> > holding the slot. Once the process holding the slot exits, one worker
> > finishes the invalidation and the others see it's done and move on.
> >
> > However, IMHO, this is unlikely to be a problem in practice.
> >
> > First, SIGTERM must take a long time to terminate the process holding
> > the slot. This seems unlikely unless I'm missing some cases.
> >
> > Second, the slot's xmin must be very old (past XID age) while the
> > process is still running but slow to exit. If we set max_slot_xid_age
> > close to vacuum_failsafe_age (e.g., 1.6 billion. I've added this note
> > in the docs), it seems unlikely that the replication connection would
> > still be active at that point.
> >
> > Also, concurrent invalidation can already happen today between the
> > startup process and checkpointer on standby.
> >
> > If needed, we could add a flag to skip extra invalidation attempts
> > based on field experience. Since this feature is off by default, I'd
> > prefer to keep things simple, but I'm open to other approaches.
> >
> > Thoughts?
>
> Thank you Sawada-san. I've been thinking more about it and I agree we
> need to address this. While I still think the scenario is unlikely in
> practice (SIGTERM would have to take a long time, the slot's xmin
> would have to be very old while the walsender is still running, etc.),
> I think it's worth handling.
>
> I can think of a couple of approaches:
>
> 1. Use ConditionVariableTimedSleep instead of ConditionVariableSleep
> when called from an autovacuum worker. Workers don't block forever,
> but they still wait for the timeout duration, still send redundant
> SIGTERMs, and a correct timeout value needs to be chosen. When it
> expires, the worker either retries (still stuck) or gives up (same as
> approach 2).
>
> 2. Make the vacuum path non-blocking when another process is already
> invalidating the same slot. The first process to attempt invalidation
> proceeds normally: it sends SIGTERM and waits on
> ConditionVariableSleep for the process holding the slot to exit. But
> if a subsequent autovacuum worker finds that another process has
> already initiated invalidation of this slot, it skips the slot and
> proceeds with vacuum instead of waiting on the same
> ConditionVariableSleep.
>
> I think approach 2 is simple. If another process is already
> invalidating the slot, there's no reason for the autovacuum worker to
> also block. The tradeoff is that this vacuum cycle's OldestXmin won't
> move forward and it will need another cycle for this relation. But
> that's fine given that the scenario as explained above is unlikely to
> happen in practice.
>
> Please let me know if my thinking sounds reasonable. I'm open to other
> ideas too.

The third idea I came up with is that (auto)vacuum handles XID-aged
slot invalidation differently depending on whether the slot is in use:
if no process is holding the XID-aged slot, (auto)vacuum simply
invalidates it; if a process is still holding the slot, (auto)vacuum
just wakes up the checkpointer to invalidate it. I believe the former
case is the common one in practice, and delegating the invalidation of
held slots to the checkpointer might help keep vacuum from being
blocked.

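
To make the proposed split concrete, here is a minimal, standalone
sketch of the decision only. It is not PostgreSQL code: SlotSketch,
VacuumSlotAction, xid_age(), vacuum_slot_action() and the max_age
parameter are hypothetical names made up for illustration, and the age
calculation is a simplification that ignores special XIDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's real types. */
typedef uint32_t xid_t;

typedef struct SlotSketch
{
    xid_t xmin;         /* oldest XID the slot still retains */
    int   active_pid;   /* 0 when no process is holding the slot */
    bool  invalidated;  /* true once the slot has been invalidated */
} SlotSketch;

typedef enum VacuumSlotAction
{
    SLOT_ACTION_NONE,              /* slot is not XID-aged, or already invalidated */
    SLOT_ACTION_INVALIDATE_NOW,    /* unheld slot: vacuum invalidates it directly */
    SLOT_ACTION_WAKE_CHECKPOINTER  /* held slot: delegate, do not wait */
} VacuumSlotAction;

/* Simplified XID age: distance from xmin to next_xid, modulo 2^32. */
uint32_t
xid_age(xid_t next_xid, xid_t xmin)
{
    return (uint32_t) (next_xid - xmin);
}

/* Decide what (auto)vacuum would do for one slot under the third idea. */
VacuumSlotAction
vacuum_slot_action(const SlotSketch *slot, xid_t next_xid, uint32_t max_age)
{
    assert(slot != NULL);

    if (slot->invalidated || xid_age(next_xid, slot->xmin) <= max_age)
        return SLOT_ACTION_NONE;

    if (slot->active_pid == 0)
        return SLOT_ACTION_INVALIDATE_NOW;

    /* A process still holds the slot: hand off instead of blocking vacuum. */
    return SLOT_ACTION_WAKE_CHECKPOINTER;
}
```

The point of the sketch is that the vacuum path never signals the
holding process or waits on it; the only place that could block is
left to the checkpointer.
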
What do you think about the above idea?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com