Re: Streaming replication and WAL archive interactions

From: Grigory Smolkin <smallkeen(at)gmail(dot)com>
To: Jaroslav Novikov <njrslv(at)yandex-team(dot)ru>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: hlinnaka(at)iki(dot)fi, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Venkata Balaji N <nag1010(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Borodin Vladimir <root(at)simply(dot)name>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, nkak(at)vmware(dot)com, Roman Khapov <rkhapov(at)yandex-team(dot)ru>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, ShirishaRao(at)vmware(dot)com
Subject: Re: Streaming replication and WAL archive interactions
Date: 2026-05-03 22:50:12
Message-ID: 4516e7e8-46f0-4ded-907a-db5a7c0c75b3@gmail.com
Lists: pgsql-hackers

Hello, hackers!
I would like to thank the community and all participants of this thread
for their interest in this problem.
In our production system, with tens of thousands of PostgreSQL clusters,
we encounter exactly the same issue and are forced to synchronize
upstreams and downstreams via external means, which is quite suboptimal.
I've done some work on top of the proposed v4 patch and would like to
present v5 for discussion.
The changes include sending just the TLI and Segno instead of the full
WAL filename, shifting some work into the archiver, and adding shared
memory for walreceiver/archiver synchronization.
Several issues remain unresolved and are worth discussing.
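
To illustrate why TLI and Segno alone are enough for the report, here is
a minimal standalone sketch that rebuilds the WAL file name from the
pair. It follows the naming layout used by XLogFileName(); the
ArchiveReport struct and the report_to_wal_filename() helper are
hypothetical names for illustration and are not taken from the patch.
The receiver is assumed to know wal_segment_size already.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define WAL_FNAME_LEN 24	/* three fields of 8 hex digits each */

/* Hypothetical report payload; NOT the struct used in the patch. */
typedef struct ArchiveReport
{
	uint32_t	tli;		/* timeline ID */
	uint64_t	segno;		/* WAL segment number */
} ArchiveReport;

/*
 * Rebuild the WAL file name from (tli, segno) using the same layout as
 * XLogFileName(): the TLI, then the segment number split into a "log"
 * part and a "seg" part, each printed as 8 uppercase hex digits.
 */
void
report_to_wal_filename(const ArchiveReport *rep, uint64_t wal_segsz_bytes,
					   char *buf, size_t buflen)
{
	/* Number of segments that fit into one 4 GB "xlogid". */
	uint64_t	segs_per_xlogid = UINT64_C(0x100000000) / wal_segsz_bytes;

	assert(buflen > WAL_FNAME_LEN);
	snprintf(buf, buflen, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32,
			 rep->tli,
			 (uint32_t) (rep->segno / segs_per_xlogid),
			 (uint32_t) (rep->segno % segs_per_xlogid));
	assert(strlen(buf) == WAL_FNAME_LEN);
}
```

With the default 16 MB segment size, the pair (tli = 1, segno = 1) maps
to 000000010000000000000001, so the sender saves bytes on the wire and
the receiver loses no information.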

1. Should we update pg_stat_archiver on the standby to support cascading
replication, or should we just resend the report received from upstream?
Personally I'm more inclined towards the pg_stat_archiver path, because
this way there will be less `if-else` programming and
archive_mode=shared behaviour will be more monitoring-friendly.

2. What should we do with *.backup.ready and *.partial.ready on standby?
Can we just XLogArchiveForceDone() them?

3. Should we keep the awkward part with the switchpoint calculation in
the timeline switch case? I think all segments that are not in our
server's history should just be stamped with XLogArchiveForceDone().

4. Currently XLogArchiveForceDone() is called by both the walreceiver
(on receiving a report from upstream) and the archiver. Should we move
this into the archiver entirely?

Any feedback will be much appreciated.

Attachment Content-Type Size
v5_0001-Add-archive_mode-shared-for-coordinated-WAL-archiving.patch text/x-patch 39.9 KB
