RE: Synchronizing slots from primary to standby

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: RE: Synchronizing slots from primary to standby
Date: 2024-03-28 04:38:19
Message-ID: OS0PR01MB5716B3942AE49F3F725ACA92943B2@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

When analyzing one BF error[1], we find an issue of slotsync: Since we don't
perform logical decoding for the synced slots when syncing the lsn/xmin of
slot, no logical snapshots will be serialized to disk. So, when user starts to
use these synced slots after promotion, it needs to re-build the consistent
snapshot from the restart_lsn if the WAL(xl_running_xacts) at restart_lsn
position indicates that there are running transactions. This however could
cause the data that before the consistent point to be missed[2].

This issue doesn't exist on the primary because the snapshot at restart_lsn
should have been serialized to disk (SnapBuildProcessRunningXacts ->
SnapBuildSerialize), so even if the logical decoding restarts, it can find
consistent snapshot immediately at restart_lsn.

To fix this, we could use the fast forward logical decoding to advance the synced
slot's lsn/xmin when syncing these values instead of directly updating the
slot's info. This way, the snapshot will be serialized to disk when decoding.
If we could not reach to the consistent point at the remote restart_lsn, the
slot is marked as temp and will be persisted once it reaches the consistent
point. I am still analyzing the fix and will share once ready.

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2024-03-19%2010%3A03%3A06
[2] The steps to reproduce the data miss issue on a primary->standby setup:

Note, we need to set LOG_SNAPSHOT_INTERVAL_MS to a bigger number(1500000) to
prevent cocurrent LogStandbySnapshot() call and enable sync_replication_slots on the standby.

1. Create a failover logical slot on the primary.
SELECT 'init' FROM pg_create_logical_replication_slot('logicalslot', 'test_decoding', false, false, true);

2. Use the following steps to advance the restart_lsn of the failover slot to a
position where the xl_running_xacts at that position indicates that there is
running transaction.

TXN1
BEGIN;
create table dummy1(a int);

TXN2
SELECT pg_log_standby_snapshot();

TXN1
COMMIT;

TXN1
BEGIN;
create table dummy2(a int);

TXN2
SELECT pg_log_standby_snapshot();

TXN1
COMMIT;

-- the restart_lsn will be advanced to a position where there was 1 running
transaction. And we need to wait for the restart_lsn to be synced to the
standby.
SELECT pg_replication_slot_advance('logicalslot', pg_current_wal_lsn());

-- insert some data here before calling next pg_log_standby_snapshot().
INSERT INTO reptable VALUES(999);

3. Promote the standby and try to consume the change(999) from the synced slot
on the standby. We will find that no change is returned.

select * from pg_logical_slot_get_changes('logicalslot', NULL, NULL);

Best Regards,
Hou zj

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Aadhav Vignesh 2024-03-28 04:53:19 Re: Proposal: Introduce row-level security templates
Previous Message shveta malik 2024-03-28 03:46:17 Re: Introduce XID age and inactive timeout based replication slot invalidation