RE: Newly created replication slot may be invalidated by checkpoint

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, suyu(dot)cmj <mengjuan(dot)cmj(at)alibaba-inc(dot)com>, tomas <tomas(at)vondra(dot)me>, michael <michael(at)paquier(dot)xyz>, bharath(dot)rupireddyforpostgres <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: RE: Newly created replication slot may be invalidated by checkpoint
Date: 2026-01-23 02:03:16
Message-ID: TY4PR01MB169074400DA851B425BA5D4219494A@TY4PR01MB16907.jpnprd01.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thursday, January 22, 2026 2:54 PM Kuroda, Hayato/黒田 隼人 <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Thanks for updating the patch. Further comments.

Thanks for the comments.

>
> 01.
> ```
> +#include "access/xlog.h"
> ```
>
> I could build without the inclusion because "replication/logical.h" already
> includes it. Can we remove or we should retain?

Removed.

>
> 02.
> Should we increase checkpoint_timeout for stabilizing tests?

I think we don't need this as concurrent checkpoint won't
cause the slot to be invalidated.

>
> 03.
> To confirm, you've removed the logic that checks the oldest segment and try
> reserving, but it can be removed same as ReplicationSlotReserveWal(), right?

If you meant we can remove the retry logic similar to what 3510ebe did, the
understanding is correct.

> XLogGetOldestSegno() is also not needed anymore because race can't happen
> if standby have never discarded.

Yes

>
> 04.
> ```
> $primary->psql('postgres',
> q{SELECT pg_create_logical_replication_slot('failover_slot',
> 'test_decoding', false, false, true);
> SELECT pg_create_physical_replication_slot('phys_slot');}
> );
> ...
> $primary->psql('postgres', "CHECKPOINT"); ```
>
> I found two lines use `psql()`, but should be `safe_psql()`.

Changed.

>
> 05.
> Per my tests, the issue exists till PG17 and your patch can be backpatched till
> it, right?

This patch cannot be applied cleanly on backbranches, I can prepare patches for
those once the main patch is stable.

Best Regards,
Hou zj

Attachment Content-Type Size
v3-0002-Add-a-tap-test-using-injection-point.patch application/octet-stream 4.5 KB
v3-0001-Fix-race-conditions-causing-invalidation-of-newly.patch application/octet-stream 7.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2026-01-23 02:09:27 Re: UPDATE run check constraints for affected columns only
Previous Message Andres Freund 2026-01-23 01:18:21 Re: More speedups for tuple deformation