Re: Switching XLog source from archive to streaming when primary available

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Cary Huang <cary(dot)huang(at)highgo(dot)ca>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>
Subject: Re: Switching XLog source from archive to streaming when primary available
Date: 2023-01-21 05:43:42
Message-ID: CALj2ACVB5eVTpj6YLkTqE1+mBGhyVLu9SdJg9CAj595rW5XSFA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 19, 2023 at 6:20 AM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>
> On Tue, Jan 17, 2023 at 07:44:52PM +0530, Bharath Rupireddy wrote:
> > On Thu, Jan 12, 2023 at 6:21 AM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
> >> With your patch, we might replay one of these "old" files in pg_wal instead
> >> of the complete version of the file from the archives,
> >
> > That's true even today, without the patch, no? We're not changing the
> > existing behaviour of the state machine. Can you explain how it
> > happens with the patch?
>
> My point is that on HEAD, we will always prefer a complete archive file.
> With your patch, we might instead choose to replay an old file in pg_wal
> because we are artificially advancing the state machine. IOW even if
> there's a complete archive available, we might not use it. This is a
> behavior change, but I think it is okay.

Oh, yeah, I agree that it's okay, because manually copying WAL files
directly into pg_wal (which eventually get replayed before switching to
streaming) isn't recommended for production-level servers anyway. I
think the documentation already covers this: it says that the standby
exhausts all the WAL present in pg_wal before switching. Isn't that
enough?

+ Specifies amount of time after which standby attempts to switch WAL
+ source from WAL archive to streaming replication (get WAL from
+ primary). However, exhaust all the WAL present in pg_wal before
+ switching. If the standby fails to switch to stream mode, it falls
+ back to archive mode.

> >> Would you mind testing this scenario?
> >
> > [...] standby should receive f6 via archive and replay it (check the
> > replay lsn an[...]
>
> I meant testing the scenario where there's an old file in pg_wal, a
> complete file in the archives, and your new GUC forces replay of the
> former. This might be difficult to do in a TAP test. Ultimately, I just
> want to validate the assumptions discussed above.

I think testing the scenario [1] is achievable. I was able to write a
TAP test for it:
https://github.com/BRupireddy/postgres/tree/prefer_archived_wal_v1
It's a bit flaky and needs a little more work to stabilize: (1) writing
a custom script for restore_command that sleeps only after fetching an
existing WAL file from the archive, and not for a history file or a
non-existent WAL file; (2) finding a command-line way to sleep on
Windows. It seems doable, though. I can spend some more time on it if
anyone thinks the test is worth adding to core, perhaps discussing it
separately from this thread.
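
For illustration only, here is a minimal sketch of what such a
restore_command helper might look like. It is not the script from the
branch above: the file name restore_delay.py, the argument order, and
the function names are all assumptions. Only %f and %p are real
restore_command placeholders. Writing it in Python would also sidestep
the need for a command-line sleep on Windows, assuming Python is
available on the test machine.

```python
#!/usr/bin/env python3
"""Hypothetical restore_command helper (illustrative sketch only).

Copies the requested file out of the archive and sleeps only after an
existing WAL file has been fetched. History files and files missing
from the archive never trigger the sleep.
"""
import os
import shutil
import sys
import time


def is_history_file(fname):
    # Timeline history files are named like 00000002.history.
    return fname.endswith(".history")


def restore(archive_dir, fname, dest, delay_secs, sleep=time.sleep):
    """Return 0 on success, 1 if fname is not present in the archive."""
    src = os.path.join(archive_dir, fname)
    if not os.path.isfile(src):
        # Non-existent file: fail immediately, without sleeping.
        return 1
    shutil.copyfile(src, dest)
    if not is_history_file(fname):
        # Sleep only after an existing WAL file was fetched.
        sleep(delay_secs)
    return 0


def main(argv):
    # Assumed invocation (hypothetical paths and delay):
    #   restore_command = 'python3 restore_delay.py /path/to/archive %f %p 5'
    if len(argv) != 5:
        print("usage: restore_delay.py ARCHIVE_DIR FILE DEST DELAY_SECS",
              file=sys.stderr)
        return 2
    archive_dir, fname, dest, delay = argv[1:5]
    return restore(archive_dir, fname, dest, float(delay))


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

The sleep function is passed in as a parameter so that the logic can be
exercised without actually waiting. A non-zero exit status for a missing
file is what tells recovery that the file is not in the archive.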

[1] RestoreArchivedFile():
/*
* When doing archive recovery, we always prefer an archived log file even
* if a file of the same name exists in XLOGDIR. The reason is that the
* file in XLOGDIR could be an old, un-filled or partly-filled version
* that was copied and restored as part of backing up $PGDATA.
*

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
