Three questions about Postgres Logical Replication

From: Qinghui QH2 Guo <guoqh2(at)lenovo(dot)com>
To: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Cc: Zhiyu ZY13 Xu <xuzy13(at)lenovo(dot)com>
Subject: Three questions about Postgres Logical Replication
Date: 2019-07-15 10:02:35
Message-ID: HK0PR03MB469297B6DF24C9A4E74BE372F2CF0@HK0PR03MB4692.apcprd03.prod.outlook.com
Lists: pgsql-bugs

This is Jack from Lenovo. I ran into some problems while using PostgreSQL logical replication.
Could you help me with the following questions? Thanks.

Env: PostgreSQL 10.5
OS: CentOS 6.9

1: One server is configured with 13 publications.
All subscriptions are configured on other servers.
The publications and subscriptions are running well.
However, I found that this directory contains a lot of files: "/data/pg_logical/snapshots"

The number of files grows over time. I can reproduce this in a test environment.
[postgres(at)bljcxjb3h4 snapshots]$ find /data/postgres/instance01/data/pg_logical/snapshots -type f | wc -l
783895

-rw------- 1 postgres postgres 144 Jun 3 14:36 0-300007A8.snap
-rw------- 1 postgres postgres 144 Jun 3 14:36 0-300007E0.snap
-rw------- 1 postgres postgres 144 Jun 3 14:36 0-30000888.snap
-rw------- 1 postgres postgres 144 Jun 4 23:00 0-30000AA8.snap
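
To see how far back the files reach, the names can be summarized instead of only counted. The following is a minimal sketch, assuming each name has the form `<hex>-<hex>.snap` as in the listing above and that the two hex parts are the high and low 32 bits of a WAL position (LSN). The function names are hypothetical helpers, not part of PostgreSQL.

```python
import os
import re

# Assumed name pattern, inferred from the listing above: <hex>-<hex>.snap
SNAP_NAME = re.compile(r"^([0-9A-F]+)-([0-9A-F]+)\.snap$")


def parse_snapshot_lsn(filename):
    """Return the 64-bit position encoded in a snapshot file name, or None."""
    match = SNAP_NAME.match(filename)
    if match is None:
        return None
    high, low = (int(part, 16) for part in match.groups())
    return (high << 32) | low


def format_lsn(lsn):
    """Render a 64-bit position in the usual X/X notation."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)


def summarize_snapshots(directory):
    """Count matching .snap files and report the lowest and highest position."""
    lsns = []
    with os.scandir(directory) as entries:
        for entry in entries:
            lsn = parse_snapshot_lsn(entry.name)
            if lsn is not None and entry.is_file():
                lsns.append(lsn)
    if not lsns:
        return {"count": 0, "min_lsn": None, "max_lsn": None}
    return {
        "count": len(lsns),
        "min_lsn": format_lsn(min(lsns)),
        "max_lsn": format_lsn(max(lsns)),
    }
```

Calling `summarize_snapshots("/data/postgres/instance01/data/pg_logical/snapshots")` only reads the directory; it does not modify or delete anything.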

Why do these files exist in this directory? Can I delete them manually?

2: Recently our server restarted unexpectedly, and a file was corrupted after the restart.
The logs follow.
CST,,0,PANIC,58P01,"replication slot file ""pg_replslot/f37tt19/state"" has wrong magic number: 1369563137 instead of 17112225",,,,,,,,,""
CST,,0,LOG,00000,"startup process (PID 3663) was terminated by signal 6: Aborted",,,,,,,,,""
CST,,0,LOG,00000,"aborting startup due to startup process failure",,,,,,,,,""
CST,,0,LOG,00000,"database system is shut down",,,,,,,,,""
It looks like a replication slot file is corrupted, and the corrupted file blocks database startup.
I have to restore the entire database to resolve this problem.
Is there any other workaround? What I actually want is to start the database and then rebuild logical replication.
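
The PANIC line above reports the value found where the slot file's magic number was expected. The following is a minimal sketch for inspecting that value without starting the server. It assumes the magic number is stored as a 4-byte unsigned integer at the very start of the file, in the machine's native byte order; the log does not state the layout, so treat this as an assumption. The function names are hypothetical.

```python
import struct

# Expected value, as quoted in the PANIC message above
EXPECTED_MAGIC = 17112225


def read_slot_magic(state_path):
    """Read the first 4 bytes of a slot state file as a native-order uint32."""
    with open(state_path, "rb") as handle:
        header = handle.read(4)
    if len(header) < 4:
        raise ValueError("file shorter than 4 bytes: %s" % state_path)
    (magic,) = struct.unpack("=I", header)
    return magic


def magic_is_valid(state_path, expected=EXPECTED_MAGIC):
    """Return True if the file starts with the expected magic number."""
    return read_slot_magic(state_path) == expected
```

The file to check would be `pg_replslot/f37tt19/state` under the data directory. The sketch opens the file read-only and changes nothing.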

3: Currently, logical replication starts automatically when the database starts, and the DBA cannot control it.
I suggest making it possible to start and stop it independently.
