Re: Synchronizing slots from primary to standby

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-04-04 09:36:54
Message-ID: CAA4eK1KfRzxoT+JrR4ejbLAU-nucGpMbEoDg8v+oNrfFc4Zeyg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 4, 2024 at 1:55 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> While testing this change, I realized that the server log can get
> flooded with the following logical decoding messages, which are
> written every 200 ms:
>
> 2024-04-04 16:15:19.270 JST [3838739] LOG: starting logical decoding
> for slot "test_sub"
...
...
>
> For example, I could reproduce it with the following steps:
>
> 1. create the primary and start.
> 2. run "pgbench -i -s 100" on the primary.
> 3. run pg_basebackup to create the standby.
> 4. configure slotsync setup on the standby and start.
> 5. create a publication for all tables on the primary.
> 6. create the subscriber and start.
> 7. run "pgbench -i -Idtpf" on the subscriber.
> 8. create a subscription on the subscriber (initial data copy will start).
>
> The logical decoding logs were written every 200 ms during the initial
> data synchronization.
>
> Looking at the new changes for update_local_synced_slot():
>
> if (remote_slot->confirmed_lsn != slot->data.confirmed_flush ||
>     remote_slot->restart_lsn != slot->data.restart_lsn ||
>     remote_slot->catalog_xmin != slot->data.catalog_xmin)
> {
>     /*
>      * We can't directly copy the remote slot's LSN or xmin unless there
>      * exists a consistent snapshot at that point. Otherwise, after
>      * promotion, the slots may not reach a consistent point before the
>      * confirmed_flush_lsn which can lead to a data loss. To avoid data
>      * loss, we let slot machinery advance the slot which ensures that
>      * snapbuilder/slot statuses are updated properly.
>      */
>     if (SnapBuildSnapshotExists(remote_slot->restart_lsn))
>     {
>         /*
>          * Update the slot info directly if there is a serialized snapshot
>          * at the restart_lsn, as the slot can quickly reach consistency
>          * at restart_lsn by restoring the snapshot.
>          */
>         SpinLockAcquire(&slot->mutex);
>         slot->data.restart_lsn = remote_slot->restart_lsn;
>         slot->data.confirmed_flush = remote_slot->confirmed_lsn;
>         slot->data.catalog_xmin = remote_slot->catalog_xmin;
>         slot->effective_catalog_xmin = remote_slot->catalog_xmin;
>         SpinLockRelease(&slot->mutex);
>
>         if (found_consistent_snapshot)
>             *found_consistent_snapshot = true;
>     }
>     else
>     {
>         LogicalSlotAdvanceAndCheckSnapState(remote_slot->confirmed_lsn,
>                                             found_consistent_snapshot);
>     }
>
>     ReplicationSlotsComputeRequiredXmin(false);
>     ReplicationSlotsComputeRequiredLSN();
>
>     slot_updated = true;
>
> We call LogicalSlotAdvanceAndCheckSnapState() if any of confirmed_lsn,
> restart_lsn, or catalog_xmin differs between the remote slot and the
> local slot. In my test case, while the initial sync was in progress,
> only catalog_xmin differed and there was no serialized snapshot at
> restart_lsn, so the slotsync worker called
> LogicalSlotAdvanceAndCheckSnapState(). However, no slot properties were
> changed even after that call, and yet slot_updated was still set to
> true, so the next slot synchronization starts again after 200 ms.
>
> It seems to me that we can skip calling
> LogicalSlotAdvanceAndCheckSnapState() at least when the remote and
> local slots have the same restart_lsn and confirmed_lsn.
>

I think we can do that, but do we know what causes catalog_xmin to be
updated regularly without any change in the restart/confirmed_flush
LSN? I think the LSNs are not updated during the initial sync (copy)
phase, but then how is catalog_xmin getting updated for the same slot?

BTW, if we look at it, we will probably accept this xmin anyway, due
to the following code in LogicalIncreaseXminForSlot():

LogicalIncreaseXminForSlot()
{
    /*
     * If the client has already confirmed up to this lsn, we directly can
     * mark this as accepted. This can happen if we restart decoding in a
     * slot.
     */
    else if (current_lsn <= slot->data.confirmed_flush)
    {
        slot->candidate_catalog_xmin = xmin;
        slot->candidate_xmin_lsn = current_lsn;

        /* our candidate can directly be used */
        updated_xmin = true;
    }

> I'm not sure there are other scenarios, but is it worth fixing this symptom?
>

I think so, but let's investigate this a bit more. BTW, while thinking
about this one, I noticed that in LogicalConfirmReceivedLocation() we
first update the disk copy (see the comment at [1]) and only then the
in-memory value, whereas the same is not true in
update_local_synced_slot() for the case when a serialized snapshot
exists. Now, do we have the same risk here in the case of the standby?
I ask because I think we will use these xmins while sending the
feedback message (in XLogWalRcvSendHSFeedback()).
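
To make the ordering concern concrete, here is a rough, untested sketch
of what following the LogicalConfirmReceivedLocation() pattern in the
snapshot-exists path of update_local_synced_slot() could look like. It
assumes, as in the current code, that 'slot' is the slot acquired by
the slotsync worker, so ReplicationSlotMarkDirty()/ReplicationSlotSave()
operate on it:

/*
 * Untested sketch: persist the advanced catalog_xmin before exposing it
 * via effective_catalog_xmin, which is what
 * ReplicationSlotsComputeRequiredXmin() and hence
 * XLogWalRcvSendHSFeedback() will look at. The rest of the path
 * (found_consistent_snapshot etc.) stays as it is today.
 */
SpinLockAcquire(&slot->mutex);
slot->data.restart_lsn = remote_slot->restart_lsn;
slot->data.confirmed_flush = remote_slot->confirmed_lsn;
slot->data.catalog_xmin = remote_slot->catalog_xmin;
SpinLockRelease(&slot->mutex);

/* Write the changed xmin to disk *before* making it effective. */
ReplicationSlotMarkDirty();
ReplicationSlotSave();

SpinLockAcquire(&slot->mutex);
slot->effective_catalog_xmin = remote_slot->catalog_xmin;
SpinLockRelease(&slot->mutex);

ReplicationSlotsComputeRequiredXmin(false);
ReplicationSlotsComputeRequiredLSN();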

[1]
/*
 * We have to write the changed xmin to disk *before* we change
 * the in-memory value, otherwise after a crash we wouldn't know
 * that some catalog tuples might have been removed already.
--
With Regards,
Amit Kapila.
