Re: Question about InvalidatePossiblyObsoleteSlot()

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "suyu(dot)cmj" <mengjuan(dot)cmj(at)alibaba-inc(dot)com>, michael <michael(at)paquier(dot)xyz>, "bharath(dot)rupireddyforpostgres" <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Question about InvalidatePossiblyObsoleteSlot()
Date: 2025-10-28 08:58:03
Message-ID: aQCFm2LeKOu1lRs0@ip-10-97-1-34.eu-west-3.compute.internal
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Mon, Oct 27, 2025 at 10:22:32AM -0700, Masahiko Sawada wrote:
> On Thu, Oct 23, 2025 at 3:07 AM Bertrand Drouvot
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> >
> > Done in the attached.
>
> +1 to change the behavior to pre-818fefd8fd4.

Thanks for looking at it!

> Here are some review
> comments for the v2 patch:
>
> - if (initial_restart_lsn != InvalidXLogRecPtr &&
> - initial_restart_lsn < oldestLSN)
> + XLogRecPtr restart_lsn = s->data.restart_lsn;
> +
> + if (restart_lsn != InvalidXLogRecPtr &&
> + restart_lsn < oldestLSN)
>
> I would suggest using XLogRecPtrIsInvalid() for better readability.

Yeah and I think that should be done consistently, see the proposal in [1].

> ---
> /*
> - * The slot's mutex will be released soon, and it is possible that
> - * those values change since the process holding the slot has been
> - * terminated (if any), so record them here to ensure that we
> - * would report the correct invalidation cause.
> - *
> * Unlike other slot attributes, slot's inactive_since can't be
> * changed until the acquired slot is released or the owning
> * process is terminated. So, the inactive slot can only be
> * invalidated immediately without being terminated.
> */
> - if (!terminated)
> - {
> - initial_restart_lsn = s->data.restart_lsn;
> - initial_effective_xmin = s->effective_xmin;
> - initial_catalog_effective_xmin = s->effective_catalog_xmin;
> - }
>
> invalidation_cause = DetermineSlotInvalidationCause(possible_causes,
> s, oldestLSN,
> dboid,
>
> After deleting the first paragraph of the comments, the comment starts
> with "Unlike other slot attributes.."

Yeah, it has been added in ac0e33136abc.

> seems no longer match what this
> block of codes do.

Agree.

> It needs to be updated or moved to a more
> appropriate place.

What about moving it after?

"
* If the slot can be acquired, do so and mark it invalidated
* immediately. Otherwise we'll signal the owning process, below, and
* retry."

That looks like a good place to me.

>
> >
> > > and fix
> > > the test case which relies on different messages.
> >
> > I think that 105b2cb3361 fixed it already or do you have something else in
> > mind?
>
> IIUC the test instability issue was caused by the fact that
> RUNNING_XACTS record is decoded and catalog_xmin gets advanced between
> terminating the slot owner process and invalidating the slot. The
> issue was fixed by 105b2cb3361 by stopping the writing of
> RUNNING_XACTS by injection points.

Right.

> 818fefd8fd4 was meant to fix the
> inconsistency that the reason of process termination could be
> different from the reason of the subsequent slot invalidation.

Right.

> Therefore, the test instability issue is already cleared,

Yes.

> but I think
> we might want to do something to deal with the inconsistency that we
> originally wanted to address.

I see, you mean that the tests are stable now (thanks to 105b2cb3361) but
that we should still do something for "production" cases? (i.e not making use
of injection points).

[1]: https://www.postgresql.org/message-id/flat/aQB7EvGqrbZXrMlg(at)ip-10-97-1-34(dot)eu-west-3(dot)compute(dot)internal

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2025-10-28 09:20:39 Re: [PATCH] Check that index can return in get_actual_variable_range()
Previous Message Kirill Reshke 2025-10-28 08:57:55 Re: on_error table, saving error info to a table