Re: Synchronizing slots from primary to standby

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-01-12 12:19:57
Message-ID: CAJpy0uAwz9iNw3=5QC8S3z8Bf-w0z27rE32SFxjDMer7aTQMPg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 12, 2024 at 5:30 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Thu, Jan 11, 2024 at 7:53 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Tue, Jan 9, 2024 at 6:39 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > +static bool
> > > +synchronize_one_slot(WalReceiverConn *wrconn, RemoteSlot *remote_slot)
> > > {
> > > ...
> > > + /* Slot ready for sync, so sync it. */
> > > + else
> > > + {
> > > + /*
> > > + * Sanity check: With hot_standby_feedback enabled and
> > > + * invalidations handled appropriately as above, this should never
> > > + * happen.
> > > + */
> > > + if (remote_slot->restart_lsn < slot->data.restart_lsn)
> > > + elog(ERROR,
> > > + "cannot synchronize local slot \"%s\" LSN(%X/%X)"
> > > + " to remote slot's LSN(%X/%X) as synchronization"
> > > + " would move it backwards", remote_slot->name,
> > > + LSN_FORMAT_ARGS(slot->data.restart_lsn),
> > > + LSN_FORMAT_ARGS(remote_slot->restart_lsn));
> > > ...
> > > }
> > >
> > > I was thinking about the above code in the patch and as far as I can
> > > think this can only occur if the same name slot is re-created with
> > > prior restart_lsn after the existing slot is dropped. Normally, the
> > > newly created slot (with the same name) will have higher restart_lsn
> > > but one can mimic it by copying some older slot by using
> > > pg_copy_logical_replication_slot().
> > >
> > > I don't think as mentioned in comments even if hot_standby_feedback is
> > > temporarily set to off, the above shouldn't happen. It can only lead
> > > to invalidated slots on standby.
> > >
> > > To close the above race, I could think of the following ways:
> > > 1. Drop and re-create the slot.
> > > 2. Emit LOG/WARNING in this case and once remote_slot's LSN moves
> > > ahead of local_slot's LSN then we can update it; but as mentioned in
> > > your previous comment, we need to update all other fields as well. If
> > > we follow this then we probably need to have a check for catalog_xmin
> > > as well.
> > >
> >
> > The second point as mentioned is slightly misleading, so let me try to
> > rephrase it once again: Emit LOG/WARNING in this case and once
> > remote_slot's LSN moves ahead of local_slot's LSN then we can update
> > it; additionally, we need to update all other fields like two_phase as
> > well. If we follow this then we probably need to have a check for
> > catalog_xmin as well along remote_slot's restart_lsn.
> >
> > > Now, related to this the other case which needs some handling is what
> > > if the remote_slot's restart_lsn is greater than local_slot's
> > > restart_lsn but it is a re-created slot with the same name. In that
> > > case, I think the other properties like 'two_phase', 'plugin' could be
> > > different. So, is simply copying those sufficient or do we need to do
> > > something else as well?
> > >
> >
> > Bertrand, Dilip, Sawada-San, and others, please share your opinion on
> > this problem as I think it is important to handle this race condition.
>
> Is there any good use case of copying a failover slot in the first
> place? If it's not a normal use case and we can probably live without
> it, why not always disable failover during the copy? FYI we always
> disable two_phase on copied slots. It seems to me that copying a
> failover slot could lead to problems, as long as we synchronize slots
> based on their names. IIUC without the copy, this pass should never
> happen.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com

There are multiple approaches discussed and tried when it comes to
starting a slot-sync worker. I am summarizing all here:

1) Make slotsync worker as an Auxiliary Process (like checkpointer,
walwriter, walreceiver etc). The benefit this approach provides is, it
can control begin and stop in a more flexible way as each auxiliary
process could have different checks before starting and can have
different stop conditions. But it needs code duplication for process
management(start, stop, crash handling, signals etc) and currently it
does not support db-connection smoothly (none of the auxiliary process
has one so far)

We attempted to make slot-sync worker as an auxiliary process and
faced some challenges. The slot sync worker needs db-connection and
thus needs InitPostgres(). But AuxiliaryProcessMain() and
InitPostgres() are not compatible as both invoke common functions and
end up setting many callbacks functions twice (with different args).
Also InitPostgres() does 'MyBackendId' initialization (which further
triggers some stuff) which is not needed for AuxiliaryProcess and so
on. And thus in order to make slot-sync worker as an auxiliary
process, we need something similar to InitPostgres (trimmed down
working version) which needs further detailed analysis.

2) Make slotsync worker as a 'special' process like AutoVacLauncher
which is neither an Auxiliary process nor a bgworker one. It allows
db-connection and also provides flexibility to have start and stop
conditions for a process. But it needs a lot of code-duplication
around start, stop, fork (windows, non-windows), crash-management and
stuff. It also needs to do many process-initialization stuff by
itself (which is otherwise done internally by Aux and bgworker infra).
And I am not sure if we should be adding a new process as a 'special'
one when postgres already provides bgworker and Auxiliary process
infrastructure.

3) Make slotysnc worker a bgworker. Here we just need to register our
process as a bgworker (RegisterBackgroundWorker()) by providing a
relevant start_time and restart_time and then the process management
is well taken care of. It does not need any code-duplication and
allows db-connection smoothly in registered process. The only thing it
lacks is that it does not provide flexibility of having
start-condition which then makes us to have 'enable_syncslot' as
PGC_POSTMASTER parameter rather than PGC_SIGHUP. Having said this, I
feel enable_syncslot is something which will not be changed frequently
and with the benefits provided by bgworker infra, it seems a
reasonably good option to choose this approach.

4) Another option is to have Logical Replication Launcher(or a new
process) to launch slot-sync worker. But going by the current design
where we have only 1 slotsync worker, it may be an overhead to have an
additional manager process maintained. Especially if we go by 'Logical
Replication Launcher', some extra changes will be needed there. It
will need start_time change from BgWorkerStart_RecoveryFinished to
BgWorkerStart_ConsistentState (doable but wanted to mention the
change). And provided the fact that 'Logical Replication Launcher'
does not have db-connection currently, in future if slotsync
validation-checks need to execute some sql query, it cannot do it
simply. It will either need the launcher to have db-connection or will
need new commands to be implemented for the same.

Thus weighing pros and cons of all these options, we have currently
implemented the bgworker approach (approach 3). Any feedback is
welcome.

Thanks
Shveta

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2024-01-12 12:35:24 Re: plpgsql memory leaks
Previous Message Nisha Moond 2024-01-12 12:07:58 Re: Improve the connection failure error messages