Re: [PATCH] Add archive_mode=follow_primary to prevent unarchived WAL on standby promotion

From: John H <johnhyvr(at)gmail(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Kirill Reshke <reshkekirill(at)gmail(dot)com>
Subject: Re: [PATCH] Add archive_mode=follow_primary to prevent unarchived WAL on standby promotion
Date: 2025-11-04 21:54:13
Message-ID: CA+-JvFur1o1Xij52ixbPhMNoDZQuQADz91x2ktW548wkzxNd=A@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Fri, Oct 31, 2025 at 11:14 AM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
> AFAIU archiver archives in order of reading archive_status directory, e.i. random order in worst case.
>

My understanding is that the archiver uses a heap to hold the batch of
files it will archive next, so that it does not have to scan the
directory every time. [0] The heap is ordered by file name, so it
contains only the oldest WAL segments, in order. [1]
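
To make the batching idea concrete, here is a minimal sketch, not the
pgarch.c code. It keeps the BATCH_SIZE smallest names from an unordered
directory scan. A small sorted array stands in for the heap, and
ArchiveBatch, batch_offer and BATCH_SIZE are hypothetical names. It
compares names with plain strcmp and ignores any special-casing the
real comparator applies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BATCH_SIZE 3	/* illustrative; small so the example is easy to follow */

/*
 * Hypothetical batch: holds the BATCH_SIZE smallest names seen so far,
 * kept in ascending order. A sorted array stands in for a heap here.
 */
typedef struct
{
	const char *names[BATCH_SIZE];
	size_t		count;
} ArchiveBatch;

/* Offer one name from an (unordered) directory scan to the batch. */
void
batch_offer(ArchiveBatch *batch, const char *name)
{
	size_t		pos;

	assert(batch != NULL && name != NULL);

	/* Full, and the name sorts at or after the current largest: drop it. */
	if (batch->count == BATCH_SIZE &&
		strcmp(name, batch->names[BATCH_SIZE - 1]) >= 0)
		return;

	/* Grow if there is room; otherwise the largest entry gets overwritten. */
	if (batch->count < BATCH_SIZE)
		batch->count++;

	/* Shift larger entries right until the insertion point is found. */
	pos = batch->count - 1;
	while (pos > 0 && strcmp(batch->names[pos - 1], name) > 0)
	{
		batch->names[pos] = batch->names[pos - 1];
		pos--;
	}
	batch->names[pos] = name;
}
```

Whatever order the directory scan returns names in, the batch ends up
holding the oldest segments in ascending order, which is the property
the skip logic further down relies on.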

> Anyway, we could send .done signals to standby, but we cannot be sure given standby already have WAL for which we are commanding him to avoid archiving it... And standby might have these WALs from archive already, thus not needing .done file at all.
>
> So, I implemented basic design that works for worst case. We can add some heuristics on top, but them must be negligible cheap in any possible archiving scenario.
>

At a high level, I was thinking pgarch.c on the standby just tracks
the latest WAL segment archived by the writer. Then, each time before
it attempts to archive a segment in pgarch_archiveXlog, it checks
whether the segment is < lastArchivedSegmentOnWriter. If the segment is
earlier than the writer's archived segment, it returns true and skips
the segment. It wouldn't matter if the writer's archived segment is
ahead of what has been streamed to the standby, because the standby
archiver would only compare it against what it has locally.

If the writer has archived WAL 10, it should be safe for the standby to
skip WAL 1-9. This way we don't need to stream every .done file from
the writer to the standby, because we can rely on the fact that the
segments are archived in order.
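
Roughly, the check I have in mind could look like the sketch below. It
is not a patch: standby_can_skip_segment and WAL_FNAME_LEN are
hypothetical names, and how the standby learns the writer's last
archived segment is left out. It assumes both names are plain 24-digit
segment file names on the same timeline, and falls back to archiving in
any other case.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* WAL segment file names are 24 hex digits: timeline, then segment number. */
#define WAL_FNAME_LEN 24

/*
 * Hypothetical helper, not part of pgarch.c: returns true when the standby
 * can skip archiving `segment` because it sorts strictly before the newest
 * segment the writer reports as archived.
 */
bool
standby_can_skip_segment(const char *segment,
						 const char *last_archived_on_writer)
{
	assert(segment != NULL && last_archived_on_writer != NULL);

	/* Be conservative: anything that is not a plain segment name is archived. */
	if (strlen(segment) != WAL_FNAME_LEN ||
		strlen(last_archived_on_writer) != WAL_FNAME_LEN)
		return false;

	/* Different timeline prefix (first 8 digits): do not guess, archive it. */
	if (strncmp(segment, last_archived_on_writer, 8) != 0)
		return false;

	/* Fixed-width hex names on one timeline compare in WAL order. */
	return strcmp(segment, last_archived_on_writer) < 0;
}
```

With the writer's last archived segment at ...0A, the standby would skip
...01 through ...09 and still archive ...0A and anything newer itself.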

[0] https://github.com/postgres/postgres/blob/master/src/backend/postmaster/pgarch.c#L739-L742
[1] https://github.com/postgres/postgres/blob/master/src/backend/postmaster/pgarch.c#L792-L797

Thanks,
--
John Hsu - Amazon Web Services
