[PATCH] Add cascade synchronous replication

From: Григорий Новиков <grigoriy(dot)novikov220(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: [PATCH] Add cascade synchronous replication
Date: 2025-11-12 13:05:18
Message-ID: CAGap3NssOVUhx5U8V3xSnDKaR_BDaHvJq1P=_VL0G6QKsPZQvA@mail.gmail.com
Lists: pgsql-hackers

Hello hackers,

Introduction
Using a large number of synchronous standbys creates excessive load on the
primary node. To solve this problem, cascading synchronous replication can
be used.

Overview of Changes
This patch adds synchronous cascading replication mechanics to PostgreSQL.
With it, standby servers honor the configuration parameters related to
synchronous replication. A standby reads the LSN positions of its
walsenders from the walsender data structures and computes the synchronous
LSN positions for write, flush, and apply among them using the synchronous
replication algorithm. It then takes the minimum of each computed value and
the standby's own corresponding position. To avoid synchronization
problems and unnecessary overhead, these calculations are performed by the
walreceiver process. The resulting positions are transmitted in the
standby reply message instead of the server's own positions. This
happens if the SyncRepRequested condition is met and at least one
synchronous standby server is specified in synchronous_standby_names.
If the synchronous LSN values cannot be calculated from the walsenders
(for example, because there are not enough synchronous standbys), the
server sends DefaultSendingLSN. This value lies between InvalidXLogRecPtr
and FirstNormalUnloggedLSN. Sending InvalidXLogRecPtr is not allowed
because in the pg_stat_replication view, a standby sending such a value
would be displayed as asynchronous, although it is not. The value 2 was
chosen for DefaultSendingLSN since 1 is used by one of the access methods.
When the server receives DefaultSendingLSN from a synchronous standby, it
treats the value as a regular LSN. This allows transaction execution to
continue if the configuration permits it. If not, transaction execution
stops until the cluster failure is resolved.
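
To make the reply computation above concrete, here is a minimal standalone
sketch. It is not code from the patch. The ReplyPositions struct and the
compute_reply_positions function are hypothetical names invented for
illustration, the typedef and macros are local stand-ins for the PostgreSQL
definitions, and the sketch assumes that DefaultSendingLSN is sent for all
three positions when the synchronous values cannot be computed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins so the sketch compiles outside the PostgreSQL tree. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)
/* The mail says the value 2 was chosen; this macro spelling is illustrative. */
#define DefaultSendingLSN ((XLogRecPtr) 2)

/* Hypothetical container for the three positions in a standby reply. */
typedef struct ReplyPositions
{
    XLogRecPtr write;
    XLogRecPtr flush;
    XLogRecPtr apply;
} ReplyPositions;

static XLogRecPtr
min_lsn(XLogRecPtr a, XLogRecPtr b)
{
    return (a < b) ? a : b;
}

/*
 * have_sync is true when the synchronous positions of the downstream
 * standbys could be computed (enough synchronous standbys are connected).
 * downstream_sync holds those positions; own holds this standby's own
 * write/flush/apply positions.
 */
static ReplyPositions
compute_reply_positions(bool have_sync,
                        ReplyPositions downstream_sync,
                        ReplyPositions own)
{
    ReplyPositions result;

    if (!have_sync)
    {
        /* Assumption: the fallback value is used for all three fields. */
        result.write = DefaultSendingLSN;
        result.flush = DefaultSendingLSN;
        result.apply = DefaultSendingLSN;
        return result;
    }

    /* Report the lesser of the downstream and the local position. */
    result.write = min_lsn(downstream_sync.write, own.write);
    result.flush = min_lsn(downstream_sync.flush, own.flush);
    result.apply = min_lsn(downstream_sync.apply, own.apply);
    return result;
}
```

With own positions (500, 400, 300) and downstream synchronous positions
(450, 420, 100), the sketch reports (450, 400, 100): each field is limited
by whichever side is further behind.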

Overview of Individual Patch Parts
The first part adds the SyncRepGetSendingSyncRecPtr function, which is
written similarly to SyncRepGetSyncRecPtr and is responsible for
calculating the LSN positions to be sent. These functions contained a large
common code section, which was moved to the
SyncRepGetSyncRecPtrBySyncRepMethod function. Also, for optimization
purposes, the walsender process serving a synchronous standby can call the
WalRcvForceReply function.
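
As a rough illustration of what a per-method helper has to decide, the
standalone sketch below picks one flush position from a set of standbys. It
does not reproduce SyncRepGetSyncRecPtrBySyncRepMethod, whose signature is
not shown in this mail. All names prefixed with Sketch or sketch_ are
hypothetical. The sketch assumes the behavior of the existing syncrep.c
code as I understand it: priority mode takes the oldest position among the
chosen synchronous standbys, and quorum mode takes the N-th latest position.
It also assumes the caller has already marked which standbys are
synchronous candidates.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t XLogRecPtr;    /* local stand-in for the PostgreSQL type */

#define SKETCH_MAX_STANDBYS 64

typedef enum
{
    SKETCH_PRIORITY,            /* FIRST n (...) style configuration */
    SKETCH_QUORUM               /* ANY n (...) style configuration */
} SketchSyncMethod;

typedef struct SketchStandby
{
    bool       is_sync;         /* already selected as a synchronous candidate */
    XLogRecPtr flush;           /* flush position reported by this standby */
} SketchStandby;

/* qsort comparator: descending LSN order */
static int
cmp_lsn_desc(const void *a, const void *b)
{
    XLogRecPtr x = *(const XLogRecPtr *) a;
    XLogRecPtr y = *(const XLogRecPtr *) b;

    if (x > y)
        return -1;
    if (x < y)
        return 1;
    return 0;
}

/*
 * Stores the synchronous flush position in *result and returns true, or
 * returns false when fewer than num_sync synchronous standbys are present.
 * Asynchronous standbys never influence the result.
 */
static bool
sketch_sync_flush_lsn(const SketchStandby *standbys, size_t n,
                      SketchSyncMethod method, size_t num_sync,
                      XLogRecPtr *result)
{
    XLogRecPtr candidates[SKETCH_MAX_STANDBYS];
    size_t     count = 0;

    for (size_t i = 0; i < n && count < SKETCH_MAX_STANDBYS; i++)
    {
        if (standbys[i].is_sync)
            candidates[count++] = standbys[i].flush;
    }

    if (num_sync == 0 || count < num_sync)
        return false;

    qsort(candidates, count, sizeof(XLogRecPtr), cmp_lsn_desc);

    if (method == SKETCH_PRIORITY)
        *result = candidates[count - 1];    /* oldest position */
    else
        *result = candidates[num_sync - 1]; /* N-th latest position */
    return true;
}
```

A false return corresponds to the "not enough synchronous standbys" case
described earlier in this mail, where DefaultSendingLSN is sent instead.
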
The second part of the patch is responsible for redistributing code in the
syncrep.c file into sections. This is necessary to preserve the semantics
of the sections used in this file, since now some functions can be used by
the walreceiver process, while others can be used by both walreceiver and
walsender.
The third part adds a special notation in pg_stat_replication for standbys
sending DefaultSendingLSN. If such a standby is synchronous, it is marked
with a "?" symbol. In the author's opinion, this notation can simplify
troubleshooting in the cluster, but it is not intended as a serious
solution for failure detection.
The fourth part of the patch contains fixes in recovery tests numbered 9
and 12. These tests created circular dependencies between servers. This was
not a problem as long as standbys ignored synchronous replication
parameters, but with this patch the tests broke. Also, tests for the new
mechanics were added to test 7, which is responsible for synchronous
replication.

Possible Topologies
The patch allows both asynchronous and synchronous standbys to connect to a
synchronous standby. However, the LSN positions sent by asynchronous
standbys are not considered, since the synchronous replication algorithm is
used. For the same reason, connecting a synchronous standby to an
asynchronous one is theoretically possible but meaningless.

Additional Information
The patch contains no platform-dependent elements, compiles with the -Wall
flag, and successfully passes tests. Performance optimization is a separate
task, and in the author's opinion, deserves a separate patch. Nevertheless,
local testing using Docker containers showed insignificant performance
degradation when using synchronous cascading chains.
This patch is intended primarily for discussion. It was developed for the
master branch, commit hash: b227b0bb4e032e19b3679bedac820eba3ac0d1cf.
Best wishes, Grigoriy Novikov!

Attachment                            Content-Type         Size
v1-0001-Cascade-sync-rep.patch        application/x-patch  12.0 KB
v1-0002-Refactor-syncrep.patch        application/x-patch  15.3 KB
v1-0003-Pg-stat-replication.patch     application/x-patch  2.5 KB
v1-0004-Fix-recovery-tap-tests.patch  application/x-patch  26.3 KB
