RE: Newly created replication slot may be invalidated by checkpoint

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, suyu(dot)cmj <mengjuan(dot)cmj(at)alibaba-inc(dot)com>, tomas <tomas(at)vondra(dot)me>, michael <michael(at)paquier(dot)xyz>, bharath(dot)rupireddyforpostgres <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: RE: Newly created replication slot may be invalidated by checkpoint
Date: 2026-01-26 03:03:36
Message-ID: TY4PR01MB169076CE653659F8F8BC984729493A@TY4PR01MB16907.jpnprd01.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Friday, January 23, 2026 5:06 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Jan 23, 2026 at 7:33 AM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com>
> wrote:
> >
> > This patch cannot be applied cleanly on backbranches, I can prepare
> > patches for those once the main patch is stable.
> >
>
> Some comments:
> 1.
> + /*
> + * Determine the minimum non-removable LSN by comparing the redo
> + pointer
> + * with the minimum slot LSN.
> + */
> + min_safe_lsn = GetRedoRecPtr();
> + slot_min_lsn = XLogGetReplicationSlotMinimumLSN();
>
> Can we expand these comments a bit to state why we need both RedoRecPtr
> and slot's minimum LSN?

Added some comments for this.

>
> 2.
> +# Verify that while syncing a slot to the standby server, if the WAL
> +before the # remote restart_lsn is at risk of being removed by a
> +checkpoint, the slot # cannot be synced. Otherwise, even if the slot
> +syncing succeeds, it may be # immediately invalidated by the checkpoint.
> +my $primary = $node;
>
> This comment atop the testcase is not very clear. Because, it is testing that
> the slot is synced and is not invalidated. How about:
> "Verify that the synchronized slots won't be invalidated immediately after
> synchronization in the presence of a concurrent checkpoint."?

Thanks, I have changed to use the suggested version.

>
> 3.
> +# Increase the log_min_messages setting to DEBUG2 on both the standby
> +and # primary to debug test failures, if any.
> +my $connstr_1 = $primary->connstr;
>
> Do we need this DEBUG2? I don't think we should add too many DEBUG2
> tests as it increases Log volume.

Removed.

Here are the V4 patches for both HEAD and back branches.

Best Regards,
Hou zj

Attachment Content-Type Size
v4-0001-Fix-race-conditions-causing-invalidation-of-newly.patch application/octet-stream 12.2 KB
v4_PG17-0001-Fix-race-conditions-causing-invalidation-of-_patch application/octet-stream 12.2 KB
v4_PG18-0001-Fix-race-conditions-causing-invalidation-of-_patch application/octet-stream 12.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message ywgrit 2026-01-26 03:11:22 Some questions about JIT optimization
Previous Message David Rowley 2026-01-26 03:03:19 Re: unnecessary executor overheads around seqscans