Re: Switching XLog source from archive to streaming when primary available

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Cary Huang <cary(dot)huang(at)highgo(dot)ca>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>
Subject: Re: Switching XLog source from archive to streaming when primary available
Date: 2023-01-17 14:14:52
Message-ID: CALj2ACXifCmsgqG4PqwXTR8BJ-opvxa18NPKHO98z=1awSjWmw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 12, 2023 at 6:21 AM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>
> On Tue, Oct 18, 2022 at 07:31:33AM +0530, Bharath Rupireddy wrote:
> > In summary, the standby state machine in WaitForWALToBecomeAvailable()
> > exhausts all the WAL in pg_wal before switching to streaming after
> > failing to fetch from archive. The v8 patch proposed upthread deviates
> > from this behaviour. Hence, attaching v9 patch that keeps the
> > behaviour as-is, that means, the standby exhausts all the WAL in
> > pg_wal before switching to streaming after fetching WAL from archive
> > for at least streaming_replication_retry_interval milliseconds.
>
> I think this is okay. The following comment explains why archives are
> preferred over existing files in pg_wal:
>
> * When doing archive recovery, we always prefer an archived log file even
> * if a file of the same name exists in XLOGDIR. The reason is that the
> * file in XLOGDIR could be an old, un-filled or partly-filled version
> * that was copied and restored as part of backing up $PGDATA.
>
> With your patch, we might replay one of these "old" files in pg_wal instead
> of the complete version of the file from the archives,

That's true even today, without the patch, no? We're not changing the
existing behaviour of the state machine. Can you explain how it
happens with the patch?

On HEAD, after failing to read from the archive, the standby exhausts
all WAL in pg_wal and then switches to streaming mode. With the patch,
after reading from the archive for at least
streaming_replication_retry_interval milliseconds, the standby exhausts
all WAL in pg_wal and then switches to streaming mode.
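
To make the comparison concrete, here is a rough sketch of the two
behaviours as described above. It is a simplified model, not the code in
WaitForWALToBecomeAvailable(): the enum, the function names, and the
ms_in_archive parameter are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; a simplified model, not the real recovery code. */
typedef enum { SRC_ARCHIVE, SRC_PG_WAL, SRC_STREAM } WalSource;

/*
 * Next source while the standby is reading from the archive.
 * retry_interval_ms == 0 models HEAD: only a failed fetch leaves the archive.
 * retry_interval_ms > 0 models the patch: time spent in the archive can
 * also trigger the switch.
 */
WalSource
next_from_archive(bool fetch_failed, long ms_in_archive, long retry_interval_ms)
{
    if (fetch_failed)
        return SRC_PG_WAL;
    if (retry_interval_ms > 0 && ms_in_archive >= retry_interval_ms)
        return SRC_PG_WAL;      /* patched behaviour: time-based switch */
    return SRC_ARCHIVE;
}

/* In both cases pg_wal is drained completely before streaming is tried. */
WalSource
next_from_pg_wal(bool pg_wal_exhausted)
{
    return pg_wal_exhausted ? SRC_STREAM : SRC_PG_WAL;
}

/* Small self-check that doubles as a usage example. */
void
check_model(void)
{
    /* HEAD: a successful archive fetch never leaves the archive. */
    assert(next_from_archive(false, 60000, 0) == SRC_ARCHIVE);
    /* HEAD and patch: a failed fetch moves on to pg_wal. */
    assert(next_from_archive(true, 0, 0) == SRC_PG_WAL);
    /* Patch: the interval has elapsed, so move on to pg_wal. */
    assert(next_from_archive(false, 5000, 5000) == SRC_PG_WAL);
    /* Streaming is attempted only once pg_wal is exhausted. */
    assert(next_from_pg_wal(false) == SRC_PG_WAL);
    assert(next_from_pg_wal(true) == SRC_STREAM);
}
```

The only difference between the two modes in this model is the second
condition in next_from_archive(); the pg_wal step is the same in both.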

> but I think that is
> still correct. We'll just replay whatever exists in pg_wal (which may be
> un-filled or partly-filled) before attempting streaming. If that fails,
> we'll go back to trying the archives again.
>
> Would you mind testing this scenario?

How about something like below for testing the above scenario? If it
looks okay, I can add it as a new TAP test file.

1. Generate WAL files f1 and f2 and archive them.
2. Check the replay lsn and WAL file name on the standby; once it has
replayed up to f2, stop the standby.
3. Set recovery to fail on the standby, and stop the standby.
4. Generate f3, f4 (partially filled) on the primary.
5. Manually copy f3, f4 to the standby's pg_wal.
6. Start the standby. Since recovery is set to fail and there are new
WAL files (f3, f4) under its pg_wal, it must replay those WAL files
(check the replay lsn and WAL file name, it must be f4) before
switching to streaming.
7. Generate f5 on the primary.
8. The standby should receive f5 and replay it (check the replay lsn
and WAL file name, it must be f5).
9. Set streaming to fail on the standby and set recovery to succeed.
10. Generate f6 on the primary.
11. The standby should receive f6 via archive and replay it (check the
replay lsn and WAL file name, it must be f6).

If needed, we can look out for these messages to confirm it works as expected:
    elog(DEBUG2, "switched WAL source from %s to %s after %s",
         xlogSourceNames[oldSource], xlogSourceNames[currentSource],
         lastSourceFailed ? "failure" : "success");

    ereport(LOG,
            (errmsg("restored log file \"%s\" from archive",
                    xlogfname)));
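
For the DEBUG2 message, a test needs the exact text to look for. The
sketch below builds that text from the format string quoted above. The
helper name, the enum, and the source_names[] array are hypothetical; the
array is assumed to mirror xlogSourceNames[] and should be checked
against the tree under test.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical enum; indexes the name table below. */
typedef enum { LOG_SRC_ANY, LOG_SRC_ARCHIVE, LOG_SRC_PG_WAL, LOG_SRC_STREAM } LogSource;

/* Assumption: same strings and order as xlogSourceNames[]. */
const char *const source_names[] = {"any", "archive", "pg_wal", "stream"};

/*
 * Format the expected "switched WAL source" message into buf.
 * Returns what snprintf() returns: the length the full message needs.
 */
int
expected_switch_line(char *buf, size_t buflen,
                     LogSource from, LogSource to, bool last_failed)
{
    return snprintf(buf, buflen,
                    "switched WAL source from %s to %s after %s",
                    source_names[from], source_names[to],
                    last_failed ? "failure" : "success");
}
```

The server log line carries a prefix before this text, so a check would
need to do a substring match rather than compare whole lines.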

Essentially, it covers what the documentation
https://www.postgresql.org/docs/devel/warm-standby.html says:

"In standby mode, the server continuously applies WAL received from
the primary server. The standby server can read WAL from a WAL archive
(see restore_command) or directly from the primary over a TCP
connection (streaming replication). The standby server will also
attempt to restore any WAL found in the standby cluster's pg_wal
directory. That typically happens after a server restart, when the
standby replays again WAL that was streamed from the primary before
the restart, but you can also manually copy files to pg_wal at any
time to have them replayed."

Thoughts?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
