From: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
---|---|
To: | 'Vitaly Davydov' <v(dot)davydov(at)postgrespro(dot)ru> |
Cc: | suyu(dot)cmj <mengjuan(dot)cmj(at)alibaba-inc(dot)com>, aekorotkov <aekorotkov(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, tomas <tomas(at)vondra(dot)me>, michael <michael(at)paquier(dot)xyz>, bharath(dot)rupireddyforpostgres <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | RE: Newly created replication slot may be invalidated by checkpoint |
Date: | 2025-10-03 11:14:44 |
Message-ID: | OSCPR01MB1496618B37AF646D70C5912AEF5E4A@OSCPR01MB14966.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
Dear Vitaly,
> > Would you have enough time to work on and fix the issue?
> > One idea is to compute the required LSN by the system at the slot checkpoint. This
> > partially follows what PG18/HEAD does but seems hacky and difficult to accept.
>
> I'm working on the issue. Give me, please, a couple of days to finalize my work.
Oh, sorry. I was rude.
> In short, I think to call ReplicationSlotsComputeRequiredLSN() right before
> slotsMinReqLSN assignment in CreateCheckPoint in 17 and earlier versions. At
> this moment, we already have a new redo lsn. I consider, that the WAL
> reservation happens when we assign restart_lsn to a slot. Taking into account
> this consideration, I distinguish two cases - WAL reservation happens before or
> after new redo ptr assignment. If we reserve the WAL after new redo ptr, it will
> protect the slot's reservation, as you've mentioned. The problem appears, when
> we reserve the WAL before a new redo ptr, but the update of
> XLogCtl->replicationSlotMinLSN has not yet happened. When we assign
> slotsMinReqLSN, we use XLogCtl->replicationSlotMinLSN. The call of
> ReplicationSlotsComputeRequiredLSN before slotsMinReqLSN assignment can help.
> It will be guaranteed, that those slots with WAL reservation before a new redo
> ptr will be protected by slotsMinReqLSN, but slots with wal reservation after
> a new redo ptr will be protected by the redo ptr. I think it is about the same
> as you proposed.
Per my understanding, this happens because there is a lag between the moment the
slot's restart_lsn is set and the moment it becomes protected by the system. Your
idea is to ensure that the restart_lsn is protected by the system before the
in-memory LSN is read, right?
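The lag can be modeled with a small standalone C sketch. This is a toy model under stated assumptions, not PostgreSQL code: every `Toy`/`toy_` name is hypothetical, a single slot is assumed, and the only things taken from the thread are the ideas that the checkpoint reads a cached slot minimum (standing in for XLogCtl->replicationSlotMinLSN) and that recomputing it first (standing in for ReplicationSlotsComputeRequiredLSN()) closes the window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for XLogRecPtr; 0 plays the role of "invalid / not set". */
typedef uint64_t ToyLsn;
#define TOY_INVALID_LSN ((ToyLsn) 0)

typedef struct ToyState
{
    ToyLsn slot_restart_lsn;  /* models the slot's restart_lsn */
    ToyLsn cached_slot_min;   /* models XLogCtl->replicationSlotMinLSN */
} ToyState;

/* Models the first half of WAL reservation: only restart_lsn is assigned. */
static void
toy_reserve_wal(ToyState *s, ToyLsn lsn)
{
    s->slot_restart_lsn = lsn;
}

/* Models the recomputation step: publish the slot minimum to the cache. */
static void
toy_compute_required_lsn(ToyState *s)
{
    s->cached_slot_min = s->slot_restart_lsn;
}

/* Oldest LSN the toy checkpoint keeps: min(redo, cached slot minimum). */
static ToyLsn
toy_checkpoint_keep_from(ToyState *s, ToyLsn redo, bool recompute_first)
{
    ToyLsn keep = redo;

    assert(redo != TOY_INVALID_LSN);
    if (recompute_first)
        toy_compute_required_lsn(s);
    if (s->cached_slot_min != TOY_INVALID_LSN && s->cached_slot_min < keep)
        keep = s->cached_slot_min;
    return keep;
}

/* The slot survives if the WAL at its restart_lsn was not removed. */
static bool
toy_slot_survives(const ToyState *s, ToyLsn keep_from)
{
    return s->slot_restart_lsn == TOY_INVALID_LSN ||
           s->slot_restart_lsn >= keep_from;
}

/*
 * Scenario: the slot reserves WAL at reserve_at, the cache has not been
 * updated yet, and a checkpoint with the given redo pointer runs.
 */
static bool
toy_run(ToyLsn reserve_at, ToyLsn redo, bool recompute_first)
{
    ToyState s = {TOY_INVALID_LSN, TOY_INVALID_LSN};

    toy_reserve_wal(&s, reserve_at);
    return toy_slot_survives(&s, toy_checkpoint_keep_from(&s, redo, recompute_first));
}
```

In this model, toy_run(100, 200, false) loses the slot because the reservation precedes the redo pointer and the cache is stale, toy_run(100, 200, true) keeps it because the minimum is recomputed first, and toy_run(250, 200, false) keeps it because the redo pointer alone already covers a reservation made after it.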
> These reasonings are applied to physical slots, but it seems to be ok for
> logical slots. One moment, I'm not sure, when we create a logical slot in
> recovery. In this case, GetXLogReplayRecPtr() is used. I'm not sure, that
> redo ptr will protect such slot in CreateRestartPoint.
I wrote a reproducer for a logical slot on a standby instance. As with the
physical one, it uses an injection point while reserving the WAL, and the WAL
would then be discarded by the restartpoint.
One difference from the physical case is that the invalidated slot is not
retained, because it is still ephemeral at that time.
After adding the fix [1], I confirmed that my test cases pass, but we should
understand the standby behavior better.
[1]:
```
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7675,6 +7675,7 @@ CreateRestartPoint(int flags)
 	MemSet(&CheckpointStats, 0, sizeof(CheckpointStats));
 	CheckpointStats.ckpt_start_t = GetCurrentTimestamp();
+	XLogGetReplicationSlotMinimumLSN();
```
Best regards,
Hayato Kuroda
FUJITSU LIMITED
Attachment | Content-Type | Size |
---|---|---|
0001-Reproduce-the-slot-invalidation-on-standby.patch | application/octet-stream | 4.6 KB |