asynchronous commit&synchronous replication

From: Konstantin Knizhnik <knizhnik(at)garret(dot)ru>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: asynchronous commit&synchronous replication
Date: 2024-08-10 12:25:04
Message-ID: 09c4ab4c-fcfb-4236-a8f3-62c726586bef@garret.ru
Lists: pgsql-hackers

Hi hackers,

By default, the logical replication apply worker switches off synchronous
commit, i.e. it commits asynchronously. To quote the documentation of the
subscription parameters:

```

synchronous_commit (enum) <https://www.postgresql.org/docs/devel/sql-createsubscription.html#SQL-CREATESUBSCRIPTION-PARAMS-WITH-SYNCHRONOUS-COMMIT>

The value of this parameter overrides the synchronous_commit
<https://www.postgresql.org/docs/devel/runtime-config-wal.html#GUC-SYNCHRONOUS-COMMIT> setting
within this subscription's apply worker processes. The default value
is off.
It is safe to use off for logical replication: If the subscriber
loses transactions because of missing synchronization, the data will
be sent again from the publisher.

```

So the subscriber can confirm transactions that are not yet persisted. But
consider a PostgreSQL HA setup with:

* primary node
* (cold) standby node streaming WAL from the primary
* synchronous replication enabled, so that you get zero data loss if
the primary dies
* the primary/standby cluster is a subscriber to a remote PostgreSQL
server

It can happen that:

* the primary streams some transactions from the remote PostgreSQL,
with logical replication
* the primary crashes and failover to the standby happens
* the standby tries to stream the transactions from the publisher.
But some transactions are missed, because the primary had already
reported a higher flush LSN.

I wonder whether the community considers this scenario "expected
behavior" or a "bug"?
It seems to be quite easily fixed (see attached patch).

So should the LR apply worker take synchronous replication into account or not?
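
To make the scenario concrete, here is a minimal toy model in Python. It is
only a sketch under stated assumptions, not PostgreSQL code and not a
description of the attached patch: the classes `Publisher` and `Node`, the
function `run_scenario`, the flag `wait_for_standby` and the integer LSNs are
all hypothetical names invented for illustration. It assumes the publisher
never restarts streaming before the flush position that the subscriber has
already confirmed.

```python
from dataclasses import dataclass, field


@dataclass
class Publisher:
    """Toy remote server: remembers the highest flush LSN the subscriber confirmed."""
    commits: list                 # commit LSNs of published transactions, ascending
    confirmed_flush_lsn: int = 0  # highest LSN the subscriber reported as flushed

    def stream_from(self, requested_lsn):
        # Assumption: streaming never restarts before the confirmed position.
        start = max(requested_lsn, self.confirmed_flush_lsn)
        return [lsn for lsn in self.commits if lsn > start]


@dataclass
class Node:
    """Toy subscriber node: the list of remote commit LSNs it has durably applied."""
    applied: list = field(default_factory=list)


def run_scenario(wait_for_standby):
    """Return the commit LSNs that are missing on the standby after failover."""
    publisher = Publisher(commits=[10, 20, 30, 40])
    primary, standby = Node(), Node()

    # The primary applies all four transactions; the standby has so far
    # received WAL only for the first two.
    primary.applied.extend(publisher.stream_from(0))
    standby.applied = primary.applied[:2]

    # The primary reports a flush position to the publisher. With
    # wait_for_standby=True (hypothetical behavior) the report is capped
    # at what the standby has already received.
    local_flush = primary.applied[-1]
    standby_flush = standby.applied[-1]
    reported = min(local_flush, standby_flush) if wait_for_standby else local_flush
    publisher.confirmed_flush_lsn = reported

    # The primary crashes. The promoted standby resumes from its own position.
    resume_from = standby.applied[-1]
    standby.applied.extend(publisher.stream_from(resume_from))

    return [lsn for lsn in publisher.commits if lsn not in standby.applied]


if __name__ == "__main__":
    print("missing without waiting for standby:", run_scenario(False))
    print("missing when capped at standby position:", run_scenario(True))
```

In this model, reporting the local flush position alone loses the
transactions at LSN 30 and 40 after failover, while capping the reported
position at what the standby has received loses none. Whether the real apply
worker should behave like the second variant is exactly the question above.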

Thanks to Heikki Linnakangas <hlinnaka(at)iki(dot)fi> for describing this
scenario and Arseny Sher <ars(at)neon(dot)tech> for providing the patch.

Attachment Content-Type Size
sync_replication.patch text/plain 1.0 KB
