Re: Synchronizing slots from primary to standby

From: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Ajin Cherian <itsajin(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-10-24 10:05:09
Message-ID: da2d3264-7049-48b1-914a-9c8631c8e384@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 10/24/23 7:44 AM, Ajin Cherian wrote:
> On Mon, Oct 23, 2023 at 11:22 PM Drouvot, Bertrand
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>>
>> @@ -602,6 +602,9 @@ CreateDecodingContext(XLogRecPtr start_lsn,
>> SnapBuildSetTwoPhaseAt(ctx->snapshot_builder, start_lsn);
>> }
>>
>> + /* set failover in the slot, as requested */
>> + slot->data.failover = ctx->failover;
>> +
>>
>> I think we can get rid of this change in CreateDecodingContext().
>>
> Yes, I too noticed this in my testing, however just removing this from
> CreateDecodingContext will not allow us to change the slot's failover flag
> using Alter subscription.

Oh right.

> I am thinking of moving this change to
> StartLogicalReplication prior to calling CreateDecodingContext by
> parsing the command options in StartReplicationCmd
> without adding it to the LogicalDecodingContext.
>

Yeah, that looks like a good place to update "failover".

Doing more testing and I have a couple of remarks about he current behavior.

1) Let's imagine that:

- there is no standby
- standby_slot_names is set to a valid slot on the primary (but due to the above, not linked to any standby)
- then a create subscription on a subscriber WITH (failover = true) would start the
synchronisation but never finish (means leaving a "synchronisation" slot like "pg_32811_sync_24576_7293415241672430356"
in place coming from ReplicationSlotNameForTablesync()).

That's expected, but maybe we should emit a warning in WalSndWaitForStandbyConfirmation() on the primary when there is
a slot part of standby_slot_names which is not active/does not have an active_pid attached to it?

2) When we create a subscription, another slot is created during the subscription synchronization, namely
like "pg_16397_sync_16388_7293447291374081805" (coming from ReplicationSlotNameForTablesync()).

This extra slot appears to have failover also set to true.

So, If the standby refresh the list of slot to sync when the subscription is still synchronizing we'd see things like
on the standby:

LOG: waiting for remote slot "mysub" LSN (0/C0034808) and catalog xmin (756) to pass local slot LSN (0/C0034840) and and catalog xmin (756)
LOG: wait over for remote slot "mysub" as its LSN (0/C00368B0) and catalog xmin (756) has now passed local slot LSN (0/C0034840) and catalog xmin (756)
LOG: waiting for remote slot "pg_16397_sync_16388_7293447291374081805" LSN (0/C0034808) and catalog xmin (756) to pass local slot LSN (0/C00368E8) and and catalog xmin (756)
WARNING: slot "pg_16397_sync_16388_7293447291374081805" disappeared from the primary, aborting slot creation

I'm not sure this "pg_16397_sync_16388_7293447291374081805" should have failover set to true. If there is a failover
during the subscription creation, better to re-launch the subscription instead?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2023-10-24 10:59:05 Re: Bug: RLS policy FOR SELECT is used to check new rows
Previous Message Alena Rybakina 2023-10-24 10:00:13 Re: Simplify create_merge_append_path a bit for clarity