Add max_wal_replay_size connection parameter to libpq

From: Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Add max_wal_replay_size connection parameter to libpq
Date: 2026-03-29 17:56:25
Message-ID: 126eb1e4-d98e-4647-b629-517adbcad28e@uni-muenster.de
Lists: pgsql-hackers

Hi,

When connecting with target_session_attrs=standby (or prefer-standby,
read-only, any) and multiple standbys are available, libpq currently
selects the first acceptable candidate without regard for how "current"
its data is. A standby configured with recovery_min_apply_delay,
experiencing slow I/O, or otherwise lagging is treated the same as one
that is fully caught up.

I would like to propose a new libpq connection parameter,
max_wal_replay_size, that allows clients to skip standby servers whose
WAL replay backlog exceeds a given threshold.

Example:

psql "host=host1,host2,host3 port=5111,5222,5333 \
target_session_attrs=standby max_wal_replay_size=16MB"
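To make the "16MB" value in the example concrete, here is a minimal sketch of how such a size string could be turned into a byte count. The function name parse_size_bytes is hypothetical. The accepted units (B, kB, MB, GB, TB, as multiples of 1024) and the treatment of a bare number as bytes are assumptions modelled on PostgreSQL's memory-unit conventions. The attached patch may well parse the value differently.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: convert a size string such as "16MB" into bytes.
 * Assumption: units are B, kB, MB, GB, TB (powers of 1024), and a bare
 * number means bytes. Returns false on malformed input or overflow.
 */
bool
parse_size_bytes(const char *value, uint64_t *bytes)
{
    char               *end;
    unsigned long long  n;
    uint64_t            mult;

    /* Reject empty strings, signs and leading whitespace up front. */
    if (value[0] < '0' || value[0] > '9')
        return false;

    errno = 0;
    n = strtoull(value, &end, 10);
    if (errno != 0)
        return false;

    if (*end == '\0' || strcmp(end, "B") == 0)
        mult = 1;
    else if (strcmp(end, "kB") == 0)
        mult = UINT64_C(1) << 10;
    else if (strcmp(end, "MB") == 0)
        mult = UINT64_C(1) << 20;
    else if (strcmp(end, "GB") == 0)
        mult = UINT64_C(1) << 30;
    else if (strcmp(end, "TB") == 0)
        mult = UINT64_C(1) << 40;
    else
        return false;           /* unknown unit */

    if (n > UINT64_MAX / mult)
        return false;           /* result would not fit in 64 bits */

    *bytes = (uint64_t) n * mult;
    return true;
}
```

Under these assumptions, "16MB" yields 16777216 bytes, which is the size of one WAL segment at the default segment size.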

When this parameter is set, libpq executes a small query during
connection establishment to evaluate:

pg_wal_lsn_diff(pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn())

on the standby. If the result (a byte count) exceeds the specified
threshold, the server is skipped and the next host in the list is tried.
The check is
skipped entirely when target_session_attrs is set to primary or
read-write, since those modes already exclude standbys.

If pg_last_wal_receive_lsn() is NULL (e.g. no active WAL receiver due to
missing primary_conninfo or a disconnected upstream), the backlog cannot
be determined. In that case, the standby is treated as exceeding the
threshold and is skipped.
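The accept/skip rule described above can be sketched as a pure function over the two LSN values in their text form. This is only an illustration of the described behaviour, not code from the patch. The names parse_lsn and standby_within_limit are hypothetical, and a NULL pointer stands in for a SQL NULL result.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Parse an LSN in its text form "X/Y" (two hexadecimal 32-bit halves). */
bool
parse_lsn(const char *text, uint64_t *lsn)
{
    uint32_t    hi;
    uint32_t    lo;
    char        trailing;

    /* Exactly two conversions must succeed; trailing characters are an error. */
    if (sscanf(text, "%" SCNx32 "/%" SCNx32 "%c", &hi, &lo, &trailing) != 2)
        return false;

    *lsn = ((uint64_t) hi << 32) | lo;
    return true;
}

/*
 * Hypothetical decision helper: returns true if the standby may be used.
 * receive_lsn is NULL when pg_last_wal_receive_lsn() returned SQL NULL.
 */
bool
standby_within_limit(const char *receive_lsn, const char *replay_lsn,
                     uint64_t max_bytes)
{
    uint64_t    received;
    uint64_t    replayed;

    /* Backlog cannot be determined: treat as exceeding the threshold. */
    if (receive_lsn == NULL || replay_lsn == NULL)
        return false;
    if (!parse_lsn(receive_lsn, &received) || !parse_lsn(replay_lsn, &replayed))
        return false;

    /* A zero or negative difference never exceeds the threshold. */
    if (replayed >= received)
        return true;

    /* Skip only when the backlog is strictly greater than the limit. */
    return (received - replayed) <= max_bytes;
}
```

For instance, with receive LSN 0/4000000, replay LSN 0/3000000 and a 16MB limit, the backlog is exactly 16777216 bytes, so the standby is still accepted. One more byte of backlog would cause it to be skipped.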

This parameter measures only the apply lag on the standby itself, i.e.,
how much already-received WAL remains to be replayed. It does not
attempt to measure how far the standby is behind the primary. In
particular, a standby that is slow to receive WAL but fast to replay it
may report a small backlog here while still being significantly behind.
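A small worked example may help show the difference between the two quantities. The struct and function names below are hypothetical, and primary_lsn stands for what pg_current_wal_lsn() would report on the primary, a value the proposed check never looks at.

```c
#include <stdint.h>

/* Hypothetical snapshot of the three WAL positions, as byte offsets. */
typedef struct
{
    uint64_t    primary_lsn;    /* pg_current_wal_lsn() on the primary */
    uint64_t    receive_lsn;    /* pg_last_wal_receive_lsn() on the standby */
    uint64_t    replay_lsn;     /* pg_last_wal_replay_lsn() on the standby */
} LsnSnapshot;

/* What the proposed parameter measures: received but not yet replayed WAL. */
uint64_t
apply_backlog(const LsnSnapshot *s)
{
    return s->receive_lsn > s->replay_lsn ? s->receive_lsn - s->replay_lsn : 0;
}

/* What it does not measure: distance between the primary and replay. */
uint64_t
total_lag(const LsnSnapshot *s)
{
    return s->primary_lsn > s->replay_lsn ? s->primary_lsn - s->replay_lsn : 0;
}
```

With the primary at 0/9000000 and the standby having both received and replayed up to 0/3000000, apply_backlog() is 0 while total_lag() is 0x6000000 bytes (96 MB). Such a standby would pass a max_wal_replay_size=16MB check despite being well behind the primary.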

The attached PoC patch may make the behaviour clearer.

Any feedback on this approach would be appreciated.

Best, Jim

Attachment: v1-0001-Add-libpq-connection-parameter-max_wal_replay_siz.patch (text/x-patch, 25.6 KB)
