| From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
|---|---|
| To: | Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> |
| Cc: | suyu(dot)cmj <mengjuan(dot)cmj(at)alibaba-inc(dot)com>, tomas <tomas(at)vondra(dot)me>, michael <michael(at)paquier(dot)xyz>, bharath(dot)rupireddyforpostgres <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com> |
| Subject: | RE: Newly created replication slot may be invalidated by checkpoint |
| Date: | 2025-12-09 02:06:42 |
| Message-ID: | TY4PR01MB169071C25B2288ECD746AA65094A3A@TY4PR01MB16907.jpnprd01.prod.outlook.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Tuesday, December 9, 2025 12:40 AM Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> wrote:
>
> On Monday, December 08, 2025 13:24 MSK, "Zhijie Hou (Fujitsu)"
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> > On Monday, December 8, 2025 5:47 PM Amit Kapila
> <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > Sawada-san/Vitaly, do you have any opinion on patch or the
> > > > > direction to fix? The idea is to get this fixed for HEAD and 18,
> > > > > then continue discussion for other bank-branches and the remaining
> patches.
>
> Hi Amit, Zhijie Hou
>
> Thank you for preparing and comiting 0001 patch. I'm ok with it. I did some
> auto testing of the patch and haven't found any problems. As I realized,
> another two patches (0002, 0003) are still in review.
Thanks for testing!
>
> In my previous email I wrote about copy_replication_slot, where restart_lsn is
> assigned without any locks, but I'm not sure that email was successfully
> delivered. Masahiko Sagawa mentioned about it in one of the latest emails as
> well. I also read the answer but not completely understood it at the moment,
> sorry (need some more time to investigate). Anyway, I would prefer to use
> locks in create_physical_replication_slot rather than rely on signals handling
> which may be changed in the future.
If we want to improve that, taking lock when updating restart_lsn does not work,
because the initial restart_lsn is an old position copied from another slot,
where the WALs could have already been removed, so unlike the mechanism
mentioned in 006dd4b2, the lock cannot ensure the same. I think we might need
to some other solutions for it which can be discussed separately.
>
> One more thing, when we copy a logical replication slot,
> DecodingContextFindStartpoint reads the WAL from the specified restart_lsn
> which may be removed by a concurrent checkpoint. It can produce an error
> and stop slot copying, I guess. This behaviour may be not desirable.
Given that we don't search for a starting point for copied slots (we
pass find_startpoint=false when copying), I don't think this issue exists.
Best Regards,
Hou zj
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Chao Li | 2025-12-09 03:38:53 | Re: [Proposal] Adding callback support for custom statistics kinds |
| Previous Message | Tom Lane | 2025-12-09 01:30:55 | Re: Let's add a test for NLS translation of PRI* macros |