Re: Synchronizing slots from primary to standby

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-12-21 09:39:10
Message-ID: ZYQHvgBpH0GgQaJK@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Thu, Dec 21, 2023 at 02:23:12AM +0000, Zhijie Hou (Fujitsu) wrote:
> On Wednesday, December 20, 2023 8:42 PM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > Attach the V51 patch set which addressed Kuroda-san's comments.
> > I also tried to improve the test in 0003 to make it stable.
>
> The patches conflict with a recent commit dc21234.
> Here is the rebased V51_2 version, there is no code changes in this version.
>

Thanks!

I've a few remarks regarding 0001:

1 ===

In the commit message what about replacing "Allow logical walsenders to wait for
the physical standbys" with "Force some logical walsenders to wait for the
physical standbys"?

Also I think it would be better to first explain what we are trying to achieve
and after explain how we do it (adding a new flag in CREATE SUBSCRIPTION and so
on).

2 ===

+ <listitem>
+ <para>
+ List of physical replication slots that logical replication slots with
+ failover enabled waits for.

Worth to add a few words about what we are actually waiting for?

3 ===

+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("could not alter replication slot \"%s\" on publisher: %s",
+ slotname, pchomp(PQerrorMessage(conn->streamConn)))));

should we mention "on publisher" here, what about removing the word "publisher"?

4 ===

@@ -248,10 +262,13 @@ ReplicationSlotValidateName(const char *name, int elevel)
* during getting changes, if the two_phase option is enabled it can skip
* prepare because by that time start decoding point has been moved. So the
* user will only get commit prepared.
+ * failover: If enabled, allows the slot to be synced to physical standbys so
+ * that logical replication can be resumed after failover.

s/allows/forces ?

5 ===

+ bool ok;

parse_ok maybe?

6 ===

+ /* Need a modifiable copy of string. */
+ rawname = pstrdup(*newval);

It seems to me that the single line comments in the neighborhood functions (see
RestoreSlotFromDisk() for example) don't finish with ".". Worth to follow the
same format for all what we add in slot.c?

7 ===

+static void
+parseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover)

ParseAlterReplSlotOptions instead?

8 ===

+ * We do not need to change the failover to false if the server
+ * does not support failover (e.g. pre-PG17)

Missing "." at the end.

9 ===

+ * See comments above for twophasestate, same holds true for
+ * 'failover'

Missing "." at the end.

10 ===

+++ b/src/include/replication/walsender.h
@@ -12,6 +12,8 @@
#ifndef _WALSENDER_H
#define _WALSENDER_H

+#include "access/xlogdefs.h"

Is this include needed?

11 ===

+ * When the wait event is WAIT_FOR_STANDBY_CONFIRMATION, wait on another
+ * CV that is woken up by physical walsenders when the walreceiver has
+ * confirmed the receipt of LSN.

s/that is woken up by/that is broadcasted by/ ?

12 ===

We are mentioning in several places that the replication can be resumed after a
failover. Should we add a few words about possible lag? (see [1])

[1]: https://www.postgresql.org/message-id/CAA4eK1KihniOK21mEVYtSOHRQiGNyToUmENWp7hPbH_PMsqzkA%40mail.gmail.com

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-12-21 09:40:15 Re: Track in pg_replication_slots the reason why slots conflict?
Previous Message shveta malik 2023-12-21 09:36:06 Re: Synchronizing slots from primary to standby