Re: Strange decreasing value of pg_last_wal_receive_lsn()

From: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: godjan • <g0dj4n(at)gmail(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Strange decreasing value of pg_last_wal_receive_lsn()
Date: 2020-05-13 15:04:47
Message-ID: 20200513170447.18482c6c@firost
Lists: pgsql-hackers

On Mon, 11 May 2020 15:54:02 +0900
Michael Paquier <michael(at)paquier(dot)xyz> wrote:
[...]
> There are several HA solutions floating around in the community, and I
> got to wonder as well if some of them don't just scan the local
> pg_wal/ of each standby in this case, even if that's more simple to
> let the nodes start and replay up to their latest point available.

PAF relies on pg_last_wal_receive_lsn(). Relying on pg_last_wal_replay_lsn()
instead might be possible. As you explained, it would require comparing the
current replay LSN with the last LSN received on disk, though. This could
probably be done, e.g. with pg_waldump and a waiting loop.
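
To make the comparison concrete, here is a minimal sketch in Python, assuming
both positions are already available in the usual textual LSN form (two
hexadecimal numbers separated by a slash). The helper names parse_lsn and
replay_lag_bytes are hypothetical, and how the last LSN on disk is obtained
(pg_waldump or otherwise) is deliberately left out.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to a 64-bit integer.

    The part before the slash is the high 32 bits, the part after it is
    the low 32 bits, both in hexadecimal.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replay_lag_bytes(last_on_disk_lsn: str, replay_lsn: str) -> int:
    """Return how many bytes of WAL on disk have not been replayed yet.

    Hypothetical helper: last_on_disk_lsn is whatever position the caller
    considers the end of valid WAL on disk, replay_lsn is the value
    reported by pg_last_wal_replay_lsn().
    """
    return parse_lsn(last_on_disk_lsn) - parse_lsn(replay_lsn)
```

A result of zero would mean the standby has replayed everything it has on
disk; a positive result is the amount of WAL still to be replayed.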

However, such a waiting loop might be dangerous. If standbys are lagging far
behind, have read-only sessions, or carry load that slows down replay, the
waiting loop might be very long, maybe longer than the required RTO. The HA
automatic operator might even take curative action because of some recovery
timeout, making things worse.
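
For illustration only, a waiting loop bounded by a deadline could look like
the Python sketch below. This is not what PAF does. The function name
wait_for_replay, its parameters, and the get_replay_lsn callable (which would
typically run SELECT pg_last_wal_replay_lsn() on the standby and return a
non-NULL value) are all assumptions. The point is only that the caller has to
decide what to do when the deadline, e.g. derived from the RTO, expires
before replay catches up.

```python
import time
from typing import Callable


def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to a 64-bit integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wait_for_replay(
    get_replay_lsn: Callable[[], str],
    target_lsn: str,
    timeout_s: float,
    poll_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the replay LSN reaches target_lsn or the timeout expires.

    get_replay_lsn is a hypothetical callable supplied by the caller.
    Returns True if the target was reached, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    target = parse_lsn(target_lsn)
    while True:
        if parse_lsn(get_replay_lsn()) >= target:
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            # Replay is too slow for the allowed budget: give up and let
            # the caller decide, instead of blocking indefinitely.
            return False
        # Never sleep past the deadline.
        sleep(min(poll_s, remaining))
```

A False return is exactly the problematic case described above: the standby
is still replaying, and the budget for the decision is already spent.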

Regards,
