pg_receivewal unable to connect to promoted standby

From: RKN Sai Krishna <rknsaiforpostgres(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: pg_receivewal unable to connect to promoted standby
Date: 2022-06-24 12:23:59
Message-ID: CAMVpbFO54bQ1gZMoazSrxyRbya+acA77GzuakJHdeeGOf_uHsA@mail.gmail.com
Lists: pgsql-hackers

Hi,

I'm trying to build a setup with a primary, a standby, and pg_receivewal
(which acts as a server that keeps the entire WAL). The quorum is any one
of the standby and pg_receivewal. When the primary crashes, I promote the
standby (timeline switch from 5 to 6) and restart pg_receivewal so that it
connects to the promoted standby, but it fails with this error:

    pg_receivewal: could not send replication command "START_REPLICATION":
    ERROR: requested starting point 16/4C000000 on timeline 5 is not in this
    server's history. This server's history forked from timeline 5 at
    16/4BFFF268

The latest LSN that pg_receivewal has is 16/4BFFF268, on timeline 5.

I'm wondering why pg_receivewal asks the new primary for a starting point
of 16/4C000000 even though its latest LSN is 16/4BFFF268.
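
For reference, 16/4C000000 is exactly the first LSN of the WAL segment
that follows the one containing 16/4BFFF268, assuming the default 16 MB
segment size. A minimal sketch of that arithmetic (the helper names are
hypothetical and not taken from the PostgreSQL sources):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: the default WAL segment size of 16 MB. */
#define WAL_SEG_SIZE ((uint64_t) 16 * 1024 * 1024)

/* Combine the two halves of an LSN written as "hi/lo" into one value. */
static uint64_t
make_lsn(uint32_t hi, uint32_t lo)
{
	return ((uint64_t) hi << 32) | lo;
}

/* Hypothetical helper: number of the segment that contains lsn. */
static uint64_t
segno_of(uint64_t lsn, uint64_t seg_size)
{
	assert(seg_size > 0);
	return lsn / seg_size;
}

/* Hypothetical helper: first LSN of the segment after the one holding lsn. */
static uint64_t
next_segment_start(uint64_t lsn, uint64_t seg_size)
{
	return (segno_of(lsn, seg_size) + 1) * seg_size;
}

/* Format an LSN in the usual "hi/lo" hexadecimal notation. */
static void
format_lsn(uint64_t lsn, char *buf, size_t buflen)
{
	snprintf(buf, buflen, "%X/%X",
			 (unsigned int) (lsn >> 32),
			 (unsigned int) (lsn & 0xFFFFFFFF));
}
```

With these helpers, next_segment_start(make_lsn(0x16, 0x4BFFF268),
WAL_SEG_SIZE) formats as 16/4C000000, which matches the starting point in
the error message.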

Is that because of the following code snippet in pg_receivewal?

    /*
     * Move the starting pointer to the start of the next segment, if the
     * highest one we saw was completed. Otherwise start streaming from
     * the beginning of the .partial segment.
     */
    if (!high_ispartial)
        high_segno++;

If so, could pg_receivewal instead request WAL from the new primary
starting at the fork LSN (by first asking the primary for the fork LSN and
the corresponding timeline)?
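
As a rough illustration of the check this would need, the sketch below
parses one line in the timeline history file format (parent timeline ID,
switch point, reason, separated by tabs) and reports whether a requested
start lies beyond the fork point. It assumes the history content has
already been obtained, for example through the TIMELINE_HISTORY
replication command. The type and function names are hypothetical, and
this is not how pg_receivewal currently behaves:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical type: where a parent timeline ended. */
typedef struct
{
	uint32_t	parent_tli;		/* timeline that was forked from */
	uint64_t	switchpoint;	/* LSN at which the fork happened */
} HistoryEntry;

/*
 * Parse one history line of the form "parentTLI<TAB>hi/lo<TAB>reason".
 * Returns 1 on success, 0 if the line does not match that shape.
 */
static int
parse_history_line(const char *line, HistoryEntry *out)
{
	unsigned int tli, hi, lo;

	if (sscanf(line, "%u\t%X/%X", &tli, &hi, &lo) != 3)
		return 0;
	out->parent_tli = tli;
	out->switchpoint = ((uint64_t) hi << 32) | lo;
	return 1;
}

/*
 * Returns 1 if a client on client_tli asks for a start beyond the point
 * where the server left that timeline, which is the situation the error
 * above complains about.
 */
static int
start_is_past_fork(uint64_t requested, uint32_t client_tli,
				   const HistoryEntry *entry)
{
	return entry->parent_tli == client_tli &&
		requested > entry->switchpoint;
}
```

For a history line built from the values in the error message,
"5<TAB>16/4BFFF268<TAB>no recovery target specified", a request for
16/4C000000 on timeline 5 is detected as past the fork, while a request
for 16/4B000000 is not.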

Thanks,
RKN
