RE: Newly created replication slot may be invalidated by checkpoint

From: "Vitaly Davydov" <v(dot)davydov(at)postgrespro(dot)ru>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, amit(dot)kapila16(at)gmail(dot)com
Cc: suyu(dot)cmj <mengjuan(dot)cmj(at)alibaba-inc(dot)com>, "tomas" <tomas(at)vondra(dot)me>, "michael" <michael(at)paquier(dot)xyz>, bharath(dot)rupireddyforpostgres <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, "Alexander Korotkov" <aekorotkov(at)gmail(dot)com>, "Masahiko Sawada" <sawada(dot)mshk(at)gmail(dot)com>
Subject: RE: Newly created replication slot may be invalidated by checkpoint
Date: 2025-12-08 16:39:46
Message-ID: bb636322-3b0e-8404-5348-215c50c9ae1f@postgrespro.ru
Lists: pgsql-hackers

On Monday, December 08, 2025 13:24 MSK, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> wrote:

> On Monday, December 8, 2025 5:47 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > Sawada-san/Vitaly, do you have any opinion on patch or the direction
> > > > to fix? The idea is to get this fixed for HEAD and 18, then continue
> > > > discussion for other back-branches and the remaining patches.

Hi Amit, Zhijie Hou

Thank you for preparing and committing the 0001 patch. I'm OK with it. I did
some automated testing of the patch and haven't found any problems. As I
understand it, the other two patches (0002, 0003) are still in review.

In my previous email I wrote about copy_replication_slot, where restart_lsn is
assigned without any locks, but I'm not sure that email was successfully
delivered. Masahiko Sawada mentioned it in one of the latest emails as well. I
also read the answer but have not completely understood it yet, sorry (I need
some more time to investigate). Anyway, I would prefer to use locks in
create_physical_replication_slot rather than rely on signal handling, which may
change in the future.

One more thing: when we copy a logical replication slot,
DecodingContextFindStartpoint reads the WAL from the specified restart_lsn,
which may be removed by a concurrent checkpoint. I guess it can produce an error
and stop the slot copy. This behaviour may not be desirable.
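
The concern about assigning restart_lsn without a lock can be illustrated with
a toy model. The sketch below is not PostgreSQL code and does not reflect the
committed patch: ToySlot, ToyWal and the single lock are hypothetical names
invented for illustration, and the interleaving is replayed step by step rather
than with real concurrent processes. It only shows why a WAL position that is
chosen before the checkpoint computes its cutoff, but published after, can end
up pointing at removed WAL, and why holding one lock across both steps on each
side avoids that.

```python
import threading
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToySlot:
    # None means the slot has not reserved any WAL yet (hypothetical model).
    restart_lsn: Optional[int] = None


class ToyWal:
    """Toy WAL: every LSN in [oldest, newest] is still on disk."""

    def __init__(self, oldest: int, newest: int) -> None:
        self.oldest = oldest
        self.newest = newest
        self.slots: List[ToySlot] = []
        # Hypothetical lock shared by WAL reservation and WAL removal.
        self.lock = threading.Lock()

    def compute_cutoff(self) -> int:
        # WAL older than the minimum published restart_lsn may be removed.
        reserved = [s.restart_lsn for s in self.slots if s.restart_lsn is not None]
        return min(reserved + [self.newest])

    def remove_before(self, cutoff: int) -> None:
        self.oldest = max(self.oldest, cutoff)

    def has(self, lsn: int) -> bool:
        return lsn >= self.oldest


def racy_interleaving() -> bool:
    """Replay the problematic ordering with no locking at all."""
    wal = ToyWal(oldest=50, newest=100)
    slot = ToySlot()
    wal.slots.append(slot)
    chosen = wal.oldest            # slot creator picks a start position
    cutoff = wal.compute_cutoff()  # checkpoint sees no reservation yet
    slot.restart_lsn = chosen      # creator publishes its choice too late
    wal.remove_before(cutoff)      # checkpoint removes the WAL it needs
    return wal.has(slot.restart_lsn)


def locked_reserve(wal: ToyWal, slot: ToySlot) -> None:
    # Choosing and publishing happen as one step under the lock.
    with wal.lock:
        slot.restart_lsn = wal.oldest


def locked_checkpoint(wal: ToyWal) -> None:
    # Computing the cutoff and removing WAL happen as one step under the lock.
    with wal.lock:
        wal.remove_before(wal.compute_cutoff())


def locked_orderings() -> List[bool]:
    """With the lock, only two orderings are possible; try both."""
    results = []
    for creator_first in (True, False):
        wal = ToyWal(oldest=50, newest=100)
        slot = ToySlot()
        wal.slots.append(slot)
        if creator_first:
            locked_reserve(wal, slot)
            locked_checkpoint(wal)
        else:
            locked_checkpoint(wal)
            locked_reserve(wal, slot)
        results.append(wal.has(slot.restart_lsn))
    return results


if __name__ == "__main__":
    print("unlocked, WAL still available:", racy_interleaving())
    print("locked, WAL still available:", locked_orderings())
```

In this model a reader that starts at slot.restart_lsn after the racy ordering
finds nothing on disk, which corresponds to the error case described above for
copying a logical slot. Which real lock, if any, should play this role in
PostgreSQL is exactly the open question in this thread.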

With best regards,
Vitaly
