Re: Synchronizing slots from primary to standby

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-10-31 09:24:53
Message-ID: CAJpy0uB5oxzP0S+_Mr2M0Fuoa5u0uvYpYfy=aODp3kjGWY0YNQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Oct 31, 2023 at 11:21 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Oct 31, 2023 at 7:16 AM Hayato Kuroda (Fujitsu)
> <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> >
> > > > 2.
> > > > Currently ctx->failover is set only in the pgoutput_startup(), but not sure it is
> > > OK.
> > > > Can we change the parameter in CreateDecodingContext() or similar functions?
> > > >
> > > > Because IIUC it means that only slots which have pgoutput can wait. Other
> > > > output plugins must understand the change and set faliover flag as well -
> > > > I felt it is not good. E.g., you might miss to enable the parameter in
> > > test_decoding.
> > > >
> > > > Regarding the two_phase parameter, setting on plugin layer is good because it
> > > > quite affects the output. As for the failover, it is not related with the
> > > > content so that all of slots should be enabled.
> > > >
> > > > I think CreateDecodingContext or StartupDecodingContext() is the common
> > > path.
> > > > Or, is it the out-of-scope for now?
> > >
> > > Currently, the failover field is part of the options list in the
> > > StartReplicationCmd. This gives some
> > > level of flexibility such that only plugins that are interested in
> > > this need to handle it. The options list
> > > is only deparsed by plugins. If we move it to outside of the options list,
> > > this sort of changes the protocol for START_REPLICATION and will
> > > impact all plugins.
> > > But I agree to your larger point that, we need to do it in such a way that
> > > other plugins do not unintentionally change the 'failover' behaviour
> > > of the originally created slot.
> > > Maybe I can code it in such a way that, only if the failover option is
> > > specified in the list of options
> > > passed as part of START_REPLICATION will it change the original slot
> > > created 'failover' flag by adding
> > > another flag "failover_opt_given". Plugins that set this, will be able
> > > to change the failover flag of the slot,
> > > while plugins that do not support this will not set this and the
> > > failover flag of the created slot will remain.
> > > What do you think?
> >
> > May be OK, but I came up with a corner case that external plugins have a streaming
> > option 'failover'. What should be? Has the option been reserved?
> >
>
> Sorry, your question is not clear to me. Did you intend to say that
> the value of the existing streaming option could be 'failover'?
>
> --
> With Regards,
> Amit Kapila.

PFA v27 patch-set which has below changes:

1) Enhanced WalSndWait to replace ConditionVariableSleep on
WalSndCtl->wal_confirm_rcv_cv as per suggestion in [1].
2) WalSndWaitForWal and WalSndWaitForStandbyConfirmation is now
integrated as per suggestion in [2]. WalSndWait is invoked only once.
3) Optimized slot-creation algorithm on standby as per suggestion in
[3]. Now, during the first attempt of slots-creation we create all
active slots and add inactive ones to the pending list and then we
wait on them in the second attempt.
4) Added basic tests for failover slots.

Changes for 1 and 2 are in patch001 and for 3 and 4 are in patch002.

Thanks Hou-San for implementing changes for 1 and 2. Thanks Ajin for
implementing failover tests/4.

[1]: https://www.postgresql.org/message-id/f3228cfb-7bf3-4bd8-8f37-c55fc4054759%40gmail.com
[2]: https://www.postgresql.org/message-id/CAA4eK1J49j5ew-Tk4Ygv0nbjurJz12kZtqjHLALFuL03NBZdsg%40mail.gmail.com
[3]: https://www.postgresql.org/message-id/CAA4eK1KBL0110gamQfc62X%3D5JV8-Qjd0dw0Mq0o07cq6kE%2Bq%3Dg%40mail.gmail.com

thanks
Shveta

Attachment Content-Type Size
v27-0001-Allow-logical-walsenders-to-wait-for-the-physica.patch application/octet-stream 122.3 KB
v27-0002-Add-logical-slot-sync-capability-to-the-physical.patch application/octet-stream 116.4 KB
v27-0003-Allow-slot-sync-workers-to-wait-for-the-cascadin.patch application/octet-stream 7.6 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2023-10-31 09:27:11 Re: Synchronizing slots from primary to standby
Previous Message 邱宇航 2023-10-31 09:22:50 Simplify xlogreader.c with XLogRec* macros