Re: pg_receivewal starting position

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: pg_receivewal starting position
Date: 2021-10-25 09:10:05
Message-ID: CALj2ACWW2E4drNd5FzVBBoR-_5jUWUvBeoy7LUOkny6sxaRs0A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 25, 2021 at 12:21 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> Attached is the updated patch I am finishing with, which is rather
> clean now. I have tweaked a couple of things while on it, and
> documented better the new GetSlotInformation() in streamutil.c.

Thanks for the v11 patch, here are some comments:

1) Remove the extra whitespace in between "t" and ":"
+ pg_log_error("could not read replication slot : %s",

2) I think we should tweak the below error message
+ pg_log_error("could not read replication slot \"%s\": %s",
slot_name, PQerrorMessage(conn));
to
pg_log_error("could not read replication slot \"%s\": %s",
"IDENTIFY_SYSTEM", PQerrorMessage(conn));
Having slot name in the error message helps to isolate the error
message from tons of server logs that gets generated.

3) Also for the same reasons stated as above, change the below error message
pg_log_error("could not read replication slot: got %d rows and %d
fields, expected %d rows and %d or more fields",
to
pg_log_error("could not read replication slot \"%s\": got %d rows and
%d fields, expected %d rows and %d or more fields", slot_name,....

4) Also for the same reasons, change below
+ pg_log_error("could not parse slot's restart_lsn \"%s\"",
to
pg_log_error("could not parse replicaton slot \"%s\" restart_lsn \"%s\"",
slot_name, PQgetvalue(res, 0, 1));

5) I think we should also have assertion for the timeline id:
Assert(stream.startpos != InvalidXLogRecPtr);
Assert(stream.timeline!= 0);

6) Why do we need these two assignements?
+ if (*restart_lsn)
+ *restart_lsn = lsn_loc;
+ if (restart_tli != NULL)
+ *restart_tli = tli_loc;

+ /* Assign results if requested */
+ if (restart_lsn)
+ *restart_lsn = lsn_loc;
+ if (restart_tli)
+ *restart_tli = tli_loc;

I think we can just get rid of lsn_loc and tli_loc, initlaize
*restart_lsn = InvalidXLogRecPtr and *restart_tli = 0 at the start of
the function and directly assign the requrested values to *restart_lsn
and *restart_tli, also see comment (8).

7) Let's be consistent, change the following
+
+ if (*restart_lsn)
+ *restart_lsn = lsn_loc;
+ if (restart_tli != NULL)
+ *restart_tli = tli_loc;
to
+
+ if (restart_lsn)
+ *restart_lsn = lsn_loc;
+ if (restart_tli != NULL)
+ *restart_tli = tli_loc;

8) Let's extract the values asked by the caller, change:
+ /* restart LSN */
+ if (!PQgetisnull(res, 0, 1))

+ /* current TLI */
+ if (!PQgetisnull(res, 0, 2))
+ tli_loc = (TimeLineID) atol(PQgetvalue(res, 0, 2));

to

+ /* restart LSN */
+ if (restart_lsn && !PQgetisnull(res, 0, 1))

+ /* current TLI */
+ if (restart_tli && !PQgetisnull(res, 0, 2))

9) 80char limit crossed:
+GetSlotInformation(PGconn *conn, const char *slot_name, XLogRecPtr
*restart_lsn, TimeLineID *restart_tli)

10) Missing word "command", and use "issued to the server", so change the below:
+ <command>READ_REPLICATION_SLOT</command> is issued to retrieve the
to
+ <command>READ_REPLICATION_SLOT</command> command is issued to
the server to retrieve the

11) Will replication_slot ever be NULL? If it ever be null, then we
don't reach this far right? We see the pg_log_error("%s needs a slot
to be specified using --slot". Please revmove below if condition:
+ * server may not support this option.
+ */
+ if (replication_slot != NULL)

We can just add Assert(slot_name); in GetSlotInformation().

12) How about following:
"If a starting point cannot be calculated with the previous method,
<command>READ_REPLICATION_SLOT</command> command with the provided
slot is issued to the server for retreving the slot's restart_lsn and
timelineid"
instead of
+ and if a replication slot is used, an extra
+ <command>READ_REPLICATION_SLOT</command> is issued to retrieve the
+ slot's <literal>restart_lsn</literal> to use as starting point.

The IDENTIFY_SYSTEM descritption starts with "If a starting point
cannot be calculated....":
<listitem>
<para>
If a starting point cannot be calculated with the previous method,
the latest WAL flush location is used as reported by the server from
a <literal>IDENTIFY_SYSTEM</literal> command.

Regards,
Bharath Rupireddy.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amul Sul 2021-10-25 09:12:28 Re: TAP test for recovery_end_command
Previous Message Michael Paquier 2021-10-25 08:57:49 Re: pg_receivewal starting position