Re: Synchronizing slots from primary to standby

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-04-04 09:29:09
Message-ID: CAJpy0uCSS5zmdyUXhvw41HSdTbRqX1hbYqkOfHNj7qQ+2zn0AQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 3, 2024 at 3:36 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Apr 3, 2024 at 11:13 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Wed, Apr 3, 2024 at 9:36 AM Bharath Rupireddy
> > <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > > I quickly looked at v8, and have a nit, rest all looks good.
> > >
> > > + if (DecodingContextReady(ctx) && found_consistent_snapshot)
> > > + *found_consistent_snapshot = true;
> > >
> > > Can the found_consistent_snapshot be checked first to help avoid the
> > > function call DecodingContextReady() for pg_replication_slot_advance
> > > callers?
> > >
> >
> > Okay, changed. Additionally, I have updated the comments and commit
> > message. I'll push this patch after some more testing.
> >
>
> Pushed!

There is an intermittent BF failure observed at [1] after this commit (2ec005b).

Analysis:
We see in BF logs, that during the first call of the sync function,
restart_lsn of the synced slot is advanced to a lsn which is > than
remote slot's restart_lsn. And when sync call is done second time
without any further change on primary, it hits the error:
ERROR: cannot synchronize local slot "lsub1_slot" LSN(0/3000060) to
remote slot's LSN(0/3000028) as synchronization would move it
backwards

Relevant BF logs are given at [2]. This reproduces intermittently
depending upon if bgwriter logs running txn record when the test is
running. We were able to mimic the test case to reproduce the failure.
Please see attached bf-test.txt for steps.

Issue:
Issue is that we are having a wrong sanity check based on
'restart_lsn' in synchronize_one_slot():

if (remote_slot->restart_lsn < slot->data.restart_lsn)
elog(ERROR, ...);

Prior to commit 2ec005b, this check was okay, as we did not expect
restart_lsn of the synced slot to be ahead of remote since we were
directly copying the lsns. But now when we use 'advance' to do logical
decoding on standby, there is a possibility that restart lsn of the
synced slot is ahead of remote slot, if there are running txns records
found after reaching consistent-point while consuming WALs from
restart_lsn till confirmed_lsn. In such a case, slot-sync's advance
may end up serializing snapshots and setting restart_lsn to the
serialized snapshot point, ahead of remote one.

Fix:
The sanity check needs to be corrected. Attached a patch to address the issue.
a) The sanity check is corrected to compare confirmed_lsn rather than
restart_lsn.
Additional changes:
b) A log has been added after LogicalSlotAdvanceAndCheckSnapState() to
log the case when the local and remote slots' confirmed-lsn were not
found to be the same after sync (if at all).
c) Now we attempt to sync in update_local_synced_slot() if one of
confirmed_lsn, restart_lsn, and catalog_xmin for remote slot is ahead
of local slot instead of them just being unequal.

[1]:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-04-03%2017%3A57%3A28

[2]:
2024-04-03 18:00:41.896 UTC [3239617][client backend][0/2:0] LOG:
statement: SELECT pg_sync_replication_slots();
LOG: starting logical decoding for slot "lsub1_slot"
DETAIL: Streaming transactions committing after 0/0, reading WAL from
0/3000028.
LOG: logical decoding found consistent point at 0/3000028
DEBUG: serializing snapshot to pg_logical/snapshots/0-3000060.snap
DEBUG: got new restart lsn 0/3000060 at 0/3000060
LOG: newly created slot "lsub1_slot" is sync-ready now

2024-04-03 18:00:45.218 UTC [3243487][client backend][2/2:0] LOG:
statement: SELECT pg_sync_replication_slots();
ERROR: cannot synchronize local slot "lsub1_slot" LSN(0/3000060) to
remote slot's LSN(0/3000028) as synchronization would move it
backwards

thanks
Shveta

Attachment Content-Type Size
bf-test.txt text/plain 722 bytes
v1-0001-Correct-sanity-check-to-compare-confirmed_lsn.patch application/octet-stream 3.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2024-04-04 09:31:57 Re: Streaming read-ready sequential scan code
Previous Message Bertrand Drouvot 2024-04-04 09:28:36 Re: Autogenerate some wait events code and documentation