Re: pg_receivewal starting position

From: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>
Subject: Re: pg_receivewal starting position
Date: 2021-10-21 06:29:54
Message-ID: 5336539.LvFx2qVVIh@aivenronan
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Le jeudi 21 octobre 2021, 07:35:08 CEST Michael Paquier a écrit :
> On Wed, Oct 20, 2021 at 02:58:26PM +0200, Ronan Dunklau wrote:
> > After sending the previous patch suite, I figured it would be worthwhile
> > to
> > also have tests covering timeline switches, which was not covered before.
>
> That seems independent to me. I'll take a look.
>
> > So please find attached a new version with an additional patch for those
> > tests, covering both "resume from last know archive" and "resume from
> > the replication slots position" cases.
>
> So, taking things in order, I have looked at 0003 and 0001, and
> attached are refined versions for both of them.
>
> 0003 is an existing hole in the docs, which I think we had better
> address first and backpatch, taking into account that the starting
> point calculation considers compressed segments when looking for
> completed segments.

Ok, do you want me to propose a different patch for previous versions ?

>
> Regarding 0001, I have found the last test to check for NULL values
> returned by READ_REPLICATION_SLOT after dropping the slot overlaps
> with the first test, so I have removed that. I have expanded a bit
> the use of like(), and there were some confusion with
> PostgresNode::psql and some extra arguments (see DROP_REPLICATION_SLOT
> and CREATE_REPLICATION_SLOT, and no need for return values in the
> CREATE case either). Some comments, docs and code have been slightly
> tweaked.

Thank you for this.

>
> Here are some comments about 0002.
>
> + /* The commpand should always return precisely one tuple */
> s/commpand/command/
>
> + pg_log_error("could not fetch replication slot: got %d rows and %d
> fields, expected %d rows and %d or more fields", +
> PQntuples(res), PQnfields(res), 1, 3);
> Should this be "could not read" instead?
>
> + if (sscanf(PQgetvalue(res, 0, 1), "%X/%X", &hi, &lo) != 2)
> + {
> + pg_log_error("could not parse slot's restart_lsn \"%s\"",
> + PQgetvalue(res, 0, 1));
> + PQclear(res);
> + return false;
> + }
> Wouldn't it be saner to initialize *restart_lsn and *restart_tli to
> some default values at the top of GetSlotInformation() instead, if
> they are specified by the caller?

Ok.

> And I think that we should still
> complain even if restart_lsn is NULL.

Do you mean restart_lsn as the pointer argument to the function, or
restart_lsn as the field returned by the command ? If it's the first, I'll
change it but if it's the latter it is expected that we sometime run this on a
slot where WAL has never been reserved yet.

>
> On a quick read of 0004, I find the split of the logic with
> change_timeline() a bit hard to understand. It looks like we should
> be able to make a cleaner split, but I am not sure how that would
> look, though.

Thanks, at least if the proposal to test this seems sensible I can move
forward. I wanted to avoid having a lot of code duplication since the test
setup is a bit more complicated.
My first approach was to split it into two functions, setup_standby and
change_timeline, but then realized that what would happen between the two
invocations would basically be the same for the two test cases, so I ended up
with that patch. I'll try to see if I can see a better way of organizing that
code.

--
Ronan Dunklau

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2021-10-21 06:44:34 Re: should we allow users with a predefined role to access pg_backend_memory_contexts view and pg_log_backend_memory_contexts function?
Previous Message Dilip Kumar 2021-10-21 06:04:39 Re: [Bug] Logical Replication failing if the DateStyle is different in Publisher & Subscriber