Re: Synchronizing slots from primary to standby

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-01-23 04:14:42
Message-ID: CAHut+Ps6p6Km8_Hfy6X0KTuyqBKkhC84u23sQnnkhqkHuDL+DQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are some review comments for v65-0002

======
0. General - GUCs in messages

I think it would be better for the GUC names to all be quoted. It's
not a rule (yet), but OTOH it seems to be the consensus most people
want. See [1].

This might impact the following messages:

0.1
+ ereport(ERROR,
+ errmsg("could not fetch primary_slot_name \"%s\" info from the"
+ " primary server: %s", PrimarySlotName, res->err));

SUGGESTION
errmsg("could not fetch primary server slot \"%s\" info from the
primary server: %s", ...)
errhint("Check if \"primary_slot_name\" is configured correctly.");

~~~

0.2
+ if (!tuplestore_gettupleslot(res->tuplestore, true, false, tupslot))
+ elog(ERROR, "failed to fetch primary_slot_name tuple");

SUGGESTION
elog(ERROR, "failed to fetch tuple for the primary server slot
specified by \"primary_slot_name\"");

~~~

0.3
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("exiting from slot synchronization due to bad configuration"),
+ /* translator: second %s is a GUC variable name */
+ errdetail("The primary server slot \"%s\" specified by %s is not valid.",
+ PrimarySlotName, "primary_slot_name"));

/specified by %s/specified by \"%s\"/

~~~

0.4
+ ereport(ERROR,
+ /* translator: %s is a GUC variable name */
+ errmsg("exiting from slot synchronization due to bad configuration"),
+ errhint("%s must be defined.", "primary_slot_name"));

/%s must be defined./\"%s\" must be defined./

~~~

0.5
+ ereport(ERROR,
+ /* translator: %s is a GUC variable name */
+ errmsg("exiting from slot synchronization due to bad configuration"),
+ errhint("%s must be enabled.", "hot_standby_feedback"));

/%s must be enabled./\"%s\" must be enabled./

~~~

0.6
+ ereport(ERROR,
+ /* translator: %s is a GUC variable name */
+ errmsg("exiting from slot synchronization due to bad configuration"),
+ errhint("wal_level must be >= logical."));

errhint("\"wal_level\" must be >= logical."))

~~~

0.7
+ if (PrimaryConnInfo == NULL || strcmp(PrimaryConnInfo, "") == 0)
+ ereport(ERROR,
+ /* translator: %s is a GUC variable name */
+ errmsg("exiting from slot synchronization due to bad configuration"),
+ errhint("%s must be defined.", "primary_conninfo"));

/%s must be defined./\"%s\" must be defined./

~~~

0.8
+ ereport(ERROR,
+
+ /*
+ * translator: 'dbname' is a specific option; %s is a GUC variable
+ * name
+ */
+ errmsg("exiting from slot synchronization due to bad configuration"),
+ errhint("'dbname' must be specified in %s.", "primary_conninfo"));

/must be specified in %s./must be specified in \"%s\"./

~~~

0.9
+ ereport(LOG,
+ errmsg("skipping slot synchronization"),
+ errdetail("enable_syncslot is disabled."));

errdetail("\"enable_syncslot\" is disabled."));

======
src/backend/replication/logical/slotsync.c

1.
+/* Min and Max sleep time for slot sync worker */
+#define MIN_WORKER_NAPTIME_MS 200
+#define MAX_WORKER_NAPTIME_MS 30000 /* 30s */
+
+/*
+ * Sleep time in ms between slot-sync cycles.
+ * See wait_for_slot_activity() for how we adjust this
+ */
+static long sleep_ms = MIN_WORKER_NAPTIME_MS;

These all belong together, so I think they share a combined comment like:

SUGGESTION
The sleep time (ms) between slot-sync cycles varies dynamically
(within a MIN/MAX range) according to slot activity. See
wait_for_slot_activity() for details.

~~~

2. update_and_persist_slot

+ /* First time slot update, the function must return true */
+ if(!local_slot_update(remote_slot))
+ elog(ERROR, "failed to update slot");

Missing whitespace after 'if'

~~~

3. synchronize_one_slot

+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("exiting from slot synchronization because same"
+ " name slot \"%s\" already exists on standby",
+ remote_slot->name),
+ errdetail("A user-created slot with the same name as"
+ " failover slot already exists on the standby."));

3a.
/on standby/on the standby/

~

3b.
Now the errmsg is changed, the errdetail doesn't seem so useful. Isn't
it repeating pretty much the same information as in the errmsg?

======
src/backend/replication/walsender.c

4. GetStandbyFlushRecPtr

/*
- * Returns the latest point in WAL that has been safely flushed to disk, and
- * can be sent to the standby. This should only be called when in recovery,
- * ie. we're streaming to a cascaded standby.
+ * Returns the latest point in WAL that has been safely flushed to disk.
+ * This should only be called when in recovery.
+ *

Since it says "This should only be called when in recovery", should
there also be a check for that (e.g. RecoveryInProgress) in the added
Assert?

======
src/include/replication/walreceiver.h

5.
typedef char *(*walrcv_identify_system_fn) (WalReceiverConn *conn,
TimeLineID *primary_tli);
+/*
+ * walrcv_get_dbname_from_conninfo_fn
+ *
+ * Returns the dbid from the primary_conninfo
+ */
+typedef char *(*walrcv_get_dbname_from_conninfo_fn) (const char *conninfo);

It looks like a blank line that previously existed has been lost.

======
[1] https://www.postgresql.org/message-id/CAHut%2BPsf3NewXbsFKY88Qn1ON1_dMD6343MuWdMiiM2Ds9a_wA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2024-01-23 04:15:21 Re: In-placre persistance change of a relation
Previous Message Kyotaro Horiguchi 2024-01-23 04:13:56 Re: Make mesage at end-of-recovery less scary.