RE: Synchronizing slots from primary to standby

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: RE: Synchronizing slots from primary to standby
Date: 2023-12-22 09:41:13
Message-ID: OS0PR01MB5716BA8406F2EE72A6EA106B9494A@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thursday, December 21, 2023 5:39 PM Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> On Thu, Dec 21, 2023 at 02:23:12AM +0000, Zhijie Hou (Fujitsu) wrote:
> > On Wednesday, December 20, 2023 8:42 PM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > Attach the V51 patch set which addressed Kuroda-san's comments.
> > > I also tried to improve the test in 0003 to make it stable.
> >
> > The patches conflict with a recent commit dc21234.
> > Here is the rebased V51_2 version, there is no code changes in this version.
> >
>
> Thanks!
>
> I've a few remarks regarding 0001:

Thanks for the comments!

>
> 1 ===
>
> In the commit message what about replacing "Allow logical walsenders to wait
> for the physical standbys" with "Force some logical walsenders to wait for the
> physical standbys"?

I feel 'Allow' is OK, as the GUC standby_slot_names is optional for user. ISTM, 'force'
means we always wait for physical standbys regardless of the GUC.

>
> Also I think it would be better to first explain what we are trying to achieve and
> after explain how we do it (adding a new flag in CREATE SUBSCRIPTION and so
> on).

Noted. We are about to split the patches, so will improve each commit message after that.

>
> 4 ===
>
> @@ -248,10 +262,13 @@ ReplicationSlotValidateName(const char *name, int
> elevel)
> * during getting changes, if the two_phase option is enabled it can skip
> * prepare because by that time start decoding point has been moved. So
> the
> * user will only get commit prepared.
> + * failover: If enabled, allows the slot to be synced to physical standbys so
> + * that logical replication can be resumed after failover.
>
> s/allows/forces ?

I think whether the slot is synced also depends on the
GUC setting on standby, so I feel 'allow' is fine here.

>
> 5 ===
>
> + bool ok;
>
> parse_ok maybe?

The flag is also used to store the slot type check result, so I feel 'ok' is
better here.

>
> 6 ===
>
> + /* Need a modifiable copy of string. */
> + rawname = pstrdup(*newval);
>
> It seems to me that the single line comments in the neighborhood functions
> (see
> RestoreSlotFromDisk() for example) don't finish with ".". Worth to follow the
> same format for all what we add in slot.c?

I felt we have both styles in slot.c, but it seems Kuroda-san also
prefer removing the ".", so will address.

>
> 7 ===
>
> +static void
> +parseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover)
>
> ParseAlterReplSlotOptions instead?

I think it followed parseCreateReplSlotOptions, but I agree that it looks
inconsistent with other names. Will address.

> 11 ===
>
> + * When the wait event is WAIT_FOR_STANDBY_CONFIRMATION, wait on
> another
> + * CV that is woken up by physical walsenders when the walreceiver has
> + * confirmed the receipt of LSN.
>
> s/that is woken up by/that is broadcasted by/ ?

Will reword the comment here.

>
> 12 ===
>
> We are mentioning in several places that the replication can be resumed after a
> failover. Should we add a few words about possible lag? (see [1])
>
> [1]:
> https://www.postgresql.org/message-id/CAA4eK1KihniOK21mEVYtSOHRQiG
> NyToUmENWp7hPbH_PMsqzkA%40mail.gmail.com

It feels like the implementation detail to me, but noted. We will think more
about the document.

The comments not mentioned above look good to me.

Best Regards,
Hou zj

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Junwang Zhao 2023-12-22 09:44:20 Re: [meson] expose buildtype debug/optimization info to pg_config
Previous Message Sébastien Lardière 2023-12-22 09:17:45 Re: planner chooses incremental but not the best one