PostgreSQL with Patroni not replicating to all nodes after adding 3rd node (another secondary)

From: Zb B <zbig(dot)poland(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: PostgreSQL with Patroni not replicating to all nodes after adding 3rd node (another secondary)
Date: 2022-06-22 14:58:55
Message-ID: CAKwARkYUvo-tXp3EWVkUywZCVHNruooxeotLbO25g=qjJN_nww@mail.gmail.com
Lists: pgsql-general

Hi,
I am new to Patroni and PostgreSQL. We have set up a cluster with etcd (3
nodes), Patroni (2 nodes) and PostgreSQL (2 nodes) with replication from
primary to secondary in SYNC mode. It seemed to work fine. Then I added a
third DB node without Patroni, just to replicate the data from the primary,
using:
1) added another slot in patroni.yml:
slots:
  bdc2b:
    type: physical

2) used
pg_basebackup -v -R -h 10.17.5.211,10.17.5.83 -U replication --slot=bdc2b -D 14/data

As a result, the primary DB showed two replication slots, and the
Patroni cluster looked healthy according to:
patronictl -c /etc/patroni/patroni.yml list

(the Leader and replica were running)

But when I started my remote test application, which executes small
insert transactions, I noticed that the records are replicated to the 3rd node
only (the secondary without Patroni). They are not replicated to the secondary
node managed by Patroni (the Replica).
Some debugging using
journalctl -f
shows that the replica is not healthy and that after a while the replication
slot becomes inactive. See the log below:

Jun 22 08:06:35 xyzd3riardb02 patroni[12495]: 2022-06-22 08:06:35,280 INFO:
Got response from xyzd3riardb01 http://10.17.5.211:8008/patroni: {"state":
"running", "postmaster_start_time": "2022-06-22 05:05:37.382607-04:00",
"role": "master", "server_version": 140004, "xlog": {"location":
117558448}, "timeline": 4, "replication": [{"usename": "replication",
"application_name": "test1b", "client_addr": "10.17.5.56", "state":
"streaming", "sync_state": "async", "sync_priority": 0}, {"usename":
"replication", "application_name": "xyzd3riardb02", "client_addr":
"10.17.5.83", "state": "streaming", "sync_state": "sync", "sync_priority":
1}], "dcs_last_seen": 1655899566, "database_system_identifier":
"7111967488904966919", "patroni": {"version": "2.1.4", "scope": "test1b"}}
Jun 22 08:06:35 xyzd3riardb02 patroni[12495]: 2022-06-22 08:06:35,375
WARNING: Master (xyzd3riardb01) is still alive
Jun 22 08:06:35 xyzd3riardb02 patroni[12495]: server signaled
Jun 22 08:06:35 xyzd3riardb02 patroni[12495]: 2022-06-22 08:06:35,400 INFO:
following a different leader because i am not the healthiest node
Jun 22 08:07:05 xyzd3riardb02 patroni[12495]: 2022-06-22 08:07:05,279 INFO:
Got response from xyzd3riardb01 http://10.17.5.211:8008/patroni: {"state":
"running", "postmaster_start_time": "2022-06-22 05:05:37.382607-04:00",
"role": "master", "server_version": 140004, "xlog": {"location":
117558448}, "timeline": 4, "replication": [{"usename": "replication",
"application_name": "test1b", "client_addr": "10.17.5.56", "state":
"streaming", "sync_state": "async", "sync_priority": 0}], "dcs_last_seen":
1655899596, "database_system_identifier": "7111967488904966919", "patroni":
{"version": "2.1.4", "scope": "test1b"}}
Jun 22 08:07:05 xyzd3riardb02 patroni[12495]: 2022-06-22 08:07:05,374
WARNING: Master (xyzd3riardb01) is still alive
Jun 22 08:07:05 xyzd3riardb02 patroni[12495]: 2022-06-22 08:07:05,393 INFO:
following a different leader because i am not the healthiest node
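
The difference between the two responses above is easy to miss by eye, so here is a minimal Python sketch that compares the "replication" lists of two /patroni response bodies. The helper names (replication_members, dropped_members) are made up for illustration, and the JSON strings are trimmed copies of the two log entries with unrelated fields left out. It only shows what changed between the two polls; it does not explain why.

```python
import json


def replication_members(patroni_response: str) -> dict:
    """Map application_name -> (state, sync_state) for one /patroni response body."""
    status = json.loads(patroni_response)
    return {
        entry["application_name"]: (entry["state"], entry["sync_state"])
        for entry in status.get("replication", [])
    }


def dropped_members(before: str, after: str) -> list:
    """Names that stream from the primary in `before` but no longer in `after`."""
    gone = set(replication_members(before)) - set(replication_members(after))
    return sorted(gone)


# Trimmed copies of the responses logged at 08:06:35 and 08:07:05
# (hypothetical variable names; only the fields used here are kept).
RESPONSE_08_06_35 = """
{"state": "running", "role": "master", "timeline": 4,
 "replication": [
   {"usename": "replication", "application_name": "test1b",
    "client_addr": "10.17.5.56", "state": "streaming",
    "sync_state": "async", "sync_priority": 0},
   {"usename": "replication", "application_name": "xyzd3riardb02",
    "client_addr": "10.17.5.83", "state": "streaming",
    "sync_state": "sync", "sync_priority": 1}]}
"""

RESPONSE_08_07_05 = """
{"state": "running", "role": "master", "timeline": 4,
 "replication": [
   {"usename": "replication", "application_name": "test1b",
    "client_addr": "10.17.5.56", "state": "streaming",
    "sync_state": "async", "sync_priority": 0}]}
"""

if __name__ == "__main__":
    print("still connected:", replication_members(RESPONSE_08_07_05))
    print("dropped:", dropped_members(RESPONSE_08_06_35, RESPONSE_08_07_05))
```

With the two logged responses, the only member that disappears is xyzd3riardb02 (the sync replica at 10.17.5.83), while test1b (10.17.5.56, async) keeps streaming.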

But the Patroni cluster still looks healthy according to
patronictl -c /etc/patroni/patroni.yml list

even though the records are not being replicated to the replica.
What could be the reason? Where should I look for the problem?

Thanks,

Zbigniew
