RE: Synchronizing slots from primary to standby

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: RE: Synchronizing slots from primary to standby
Date: 2024-04-10 11:58:37
Message-ID: OS0PR01MB57162B67D3CB01B2756FBA6D94062@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Thursday, April 4, 2024 5:37 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> BTW, while thinking on this one, I
> noticed that in the function LogicalConfirmReceivedLocation(), we first update
> the disk copy, see comment [1] and then in-memory whereas the same is not
> true in
> update_local_synced_slot() for the case when snapshot exists. Now, do we have
> the same risk here in case of standby? Because I think we will use these xmins
> while sending the feedback message (in XLogWalRcvSendHSFeedback()).
>
> * We have to write the changed xmin to disk *before* we change
> * the in-memory value, otherwise after a crash we wouldn't know
> * that some catalog tuples might have been removed already.

Yes, I think we have the risk on the standby, I can reproduce the case that if
the server crashes after updating the in-memory value and before saving them to
disk, the synced slot could be invalidated after restarting from crash, because
the necessary rows have been removed on the primary. The steps can be found in
[1].

I think we'd better fix the order in update_local_synced_slot() as well. I
tried to make the fix in 0002, 0001 is Shveta's patch to fix another issue in this thread. Since
they are touching the same function, so attach them together for review.

[1]
-- Primary:
SELECT 'init' FROM pg_create_logical_replication_slot('logicalslot', 'test_decoding', false, false, true);

-- Standby:
SELECT 'init' FROM pg_create_logical_replication_slot('standbylogicalslot', 'test_decoding', false, false, false);
SELECT pg_sync_replication_slots();

-- Primary:
CREATE TABLE test (a int);
INSERT INTO test VALUES(1);
DROP TABLE test;

SELECT txid_current();
SELECT txid_current();
SELECT txid_current();
SELECT pg_log_standby_snapshot();

SELECT pg_replication_slot_advance('logicalslot', pg_current_wal_lsn());

-- Standby:
- wait for standby to replay all the changes on the primary.

- this is to serialize snapshots.
SELECT pg_replication_slot_advance('standbylogicalslot', pg_last_wal_replay_lsn());

- Use gdb to stop at the place after calling ReplicationSlotsComputexx()
functions and before calling ReplicationSlotSave().
SELECT pg_sync_replication_slots();

-- Primary:

- First, wait for the primary slot(the physical slot)'s catalog xmin to be
updated to the same as the failover slot.

VACUUM FULL;

- Wait for VACUMM FULL to be replayed on standby.

-- Standby:

- For the process which is blocked by gdb, let the process crash (elog(PANIC,
...)).

After restarting the standby from crash, we can see the synced slot is invalidated.

LOG: invalidating obsolete replication slot "logicalslot"
DETAIL: The slot conflicted with xid horizon 741.
CONTEXT: WAL redo at 0/3059B90 for Heap2/PRUNE_ON_ACCESS: snapshotConflictHorizon: 741, isCatalogRel: T, nplans: 0, nredirected: 0, ndead: 7, nunused: 0, dead: [22, 23, 24, 25, 26, 27, 28]; blkref #0: rel 1663/5/1249, blk 16

Best Regards,
Hou zj

Attachment Content-Type Size
v4-0002-write-the-changed-xmin-to-disk-before-computing-g.patch application/octet-stream 3.7 KB
v4-0001-Correct-sanity-check-to-compare-confirmed_lsn.patch application/octet-stream 5.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Lakhin 2024-04-10 12:00:00 Re: Add SPLIT PARTITION/MERGE PARTITIONS commands
Previous Message Etsuro Fujita 2024-04-10 11:51:34 Comment about handling of asynchronous requests in postgres_fdw.c