Re: pg_receivewal starting position

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: pg_receivewal starting position
Date: 2021-10-21 05:35:08
Message-ID: YXD8DHlf/cB1db5D@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 20, 2021 at 02:58:26PM +0200, Ronan Dunklau wrote:
> After sending the previous patch suite, I figured it would be worthwhile to
> also have tests covering timeline switches, which was not covered before.

That seems independent to me. I'll take a look.

> So please find attached a new version with an additional patch for those tests,
> covering both "resume from last know archive" and "resume from the
> replication slots position" cases.

So, taking things in order, I have looked at 0003 and 0001, and
attached are refined versions for both of them.

0003 is an existing hole in the docs, which I think we had better
address first and backpatch, taking into account that the starting
point calculation considers compressed segments when looking for
completed segments.

Regarding 0001, I have found the last test to check for NULL values
returned by READ_REPLICATION_SLOT after dropping the slot overlaps
with the first test, so I have removed that. I have expanded a bit
the use of like(), and there were some confusion with
PostgresNode::psql and some extra arguments (see DROP_REPLICATION_SLOT
and CREATE_REPLICATION_SLOT, and no need for return values in the
CREATE case either). Some comments, docs and code have been slightly
tweaked.

Here are some comments about 0002.

+ /* The commpand should always return precisely one tuple */
s/commpand/command/

+ pg_log_error("could not fetch replication slot: got %d rows and %d fields, expected %d rows and %d or more fields",
+ PQntuples(res), PQnfields(res), 1, 3);
Should this be "could not read" instead?

+ if (sscanf(PQgetvalue(res, 0, 1), "%X/%X", &hi, &lo) != 2)
+ {
+ pg_log_error("could not parse slot's restart_lsn \"%s\"",
+ PQgetvalue(res, 0, 1));
+ PQclear(res);
+ return false;
+ }
Wouldn't it be saner to initialize *restart_lsn and *restart_tli to
some default values at the top of GetSlotInformation() instead, if
they are specified by the caller? And I think that we should still
complain even if restart_lsn is NULL.

On a quick read of 0004, I find the split of the logic with
change_timeline() a bit hard to understand. It looks like we should
be able to make a cleaner split, but I am not sure how that would
look, though.
--
Michael

Attachment Content-Type Size
v9-0001-doc-Describe-calculation-method-of-streaming-star.patch text/x-diff 1.8 KB
v9-0002-Add-READ_REPLICATION_SLOT.patch text/x-diff 12.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-10-21 05:41:16 Re: ThisTimeLineID can be used uninitialized
Previous Message Bharath Rupireddy 2021-10-21 05:21:17 Re: can we add subscription TAP test option "vcregress subscriptioncheck" for MSVC builds?