Re: Question about InvalidatePossiblyObsoleteSlot()

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: "suyu(dot)cmj" <mengjuan(dot)cmj(at)alibaba-inc(dot)com>, michael <michael(at)paquier(dot)xyz>, "bharath(dot)rupireddyforpostgres" <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, "amit(dot)kapila16" <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Question about InvalidatePossiblyObsoleteSlot()
Date: 2025-10-17 22:08:07
Message-ID: CAD21AoD6zM1E8E1rrJe96vc9x+Kuwd8AdF4zh-76dZWTsip3DQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Oct 17, 2025 at 12:18 AM Bertrand Drouvot
<bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Thu, Oct 16, 2025 at 11:32:03AM -0700, Masahiko Sawada wrote:
> > On Thu, Oct 16, 2025 at 1:23 AM Bertrand Drouvot
> > <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> > >
> > > I'm not sure. The existing logging shows when we invalidate a slot, so the
> > > absence of that message indicates we skipped it. Users can also verify the
> > > slot status in the view if needed.
> > >
> >
> > I'm somewhat confused. As the commit message of 818fefd8fd4 says, the
> > invalidation of an active slot is done in two steps:
> >
> > - Termination of the backend holding it, if any.
> > - Report that the slot is obsolete, with a conflict cause depending on
> > the slot's data.
> >
> > Before commit 818fefd8fd4, it was possible that if slot's
> > effective_xmin, catalog_effective_xmin, or restart_lsn gets advanced
> > between two steps, we ended up skipping the actual invalidation for
> > the slot.
>
> If advancing far enough to reach RS_INVAL_NONE, right.
>
> > What the patch proposed on this thread does is that we allow such pre-
> > 818fefd8fd4 behavior only for restart_lsn. That is, we skip the actual
> > slot invalidation if the slot's restart_lsn gets advanced and exceeds
> > the oldestLSN between two steps.
>
> Exactly.
>
> > I thought the reason why we moved away from pre-818fefd8fd4 was that
> > we wanted to avoid reporting only "terminated" and to report both
> > things consistently. But if we can accept pre-818fefd8fd4 for
> > restart_lsn, why can we not do the same for effective_xmin and
> > catalog_effective_xmin?
> > I guess that the same is true also for
> > effective_xmin and catalog_effective_xmin that we can skip the actual
> > slot invalidation if those values get advanced and exceed
> > snapshotConflictHorizon.
>
> I think this is safe to do for the restart_lsn because that's the same process
> (checkpointer/startup) that is executing InvalidatePossiblyObsoleteSlot()
> and will remove the WALs. It can not remove the WALs as long as it is busy in
> InvalidatePossiblyObsoleteSlot() so that I think this is safe for the restart_lsn
> case.
>
> I'm not that confident for the xmin cases, do you think that's also safe to not
> invalidate the slots for effective_xmin and catalog_effective_xmin if they
> advance far enough?

I find the same in xmin cases. ResolveRecoveryConflictWithSnapshot()
is called only during the recovery by the startup process, and it also
tries to invalidate possibly obsolete slots. Catalog tuples are not
removed as long as the startup calls
ResolveRecoveryConflictWithSnapshot() before actually removing the
tuples and it's busy in InvalidatePossiblyObsoleteSlot(). I might be
missing something but this is the reason why I'm confused with the
818fefd8fd4 fix and the proposed change.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2025-10-17 22:23:01 Re: new environment variable INITDB_LOCALE_PROVIDER
Previous Message Jeff Davis 2025-10-17 22:02:16 Re: Change initdb default to the builtin collation provider