Re: Logical replication fails when adding multiple replicas

From: Will Roper <will(dot)roper(at)democracyclub(dot)org(dot)uk>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Logical replication fails when adding multiple replicas
Date: 2023-03-22 09:25:37
Message-ID: CA+xc_dtguFz+oLruj1sTsbE4Lcz0JtEqQi8s9LR01A7kXNJ0=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thanks for the response Hou,

I've had a look and when the tablesync workers are spinning up there are
some errors of the form:

"2023-03-17 18:37:06.900 UTC [4071] LOG: logical replication table
synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""uk_geo_utils_onspd"" has
started"
"2023-03-17 18:37:06.976 UTC [4071] ERROR: could not create replication
slot ""pg_37986_sync_37922_7210774007126708177"": ERROR: replication slot
""pg_37986_sync_37922_7210774007126708177"" already exists"
"2023-03-17 18:37:06.979 UTC [3964] LOG: background worker ""logical
replication worker"" (PID 4071) exited with exit code 1"

However further down a slot for the same table is successfully created:

"2023-03-17 18:38:41.232 UTC [4098] LOG: logical replication table
synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""uk_geo_utils_onspd"" has
started"

And it is not that table that gets stuck:

"2023-03-17 18:40:08.130 UTC [4052] ERROR: could not start WAL streaming:
ERROR: replication slot ""pg_37986_sync_37663_7210774007126708177"" does
not exist"
"2023-03-17 18:40:08.136 UTC [3964] LOG: background worker ""logical
replication worker"" (PID 4052) exited with exit code 1"
"2023-03-17 18:40:08.884 UTC [4149] LOG: logical replication table
synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""addressbase_uprntocouncil""
has started"
"2023-03-17 18:40:08.968 UTC [4149] ERROR: could not start WAL streaming:
ERROR: replication slot ""pg_37986_sync_37706_7210774007126708177"" does
not exist"
"2023-03-17 18:40:08.971 UTC [3964] LOG: background worker ""logical
replication worker"" (PID 4149) exited with exit code 1"
"2023-03-17 18:40:08.975 UTC [4150] LOG: logical replication table
synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""addressbase_address"" has
started"

I'm currently working around the issue by waiting until all tables are in '
STATE_READY' or 'STATE_FINISHED' then dropping the subscription, and
recreating it with 'copy_data=false' -
https://github.com/DemocracyClub/UK-Polling-Stations/blob/aada9def5102f44602dfaca420bb562a8d725870/deploy/files/scripts/setup_db_replication.sh#L46-L48

I've pasted in the complete logs from where I took the examples below, in
case you see something I've missed

Cheers,
Will

PS
The fujitsu postgres blog (https://www.postgresql.fastware.com/blog) is
great - thank you.

LOGS:

1679078208511,"2023-03-17 18:36:48.409 UTC [3964] LOG: starting PostgreSQL
14.7 (Ubuntu 14.7-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc
(Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0, 64-bit"
1679078208511,"2023-03-17 18:36:48.409 UTC [3964] LOG: listening on IPv4
address ""127.0.0.1"", port 5432"
1679078208511,"2023-03-17 18:36:48.409 UTC [3964] LOG: listening on Unix
socket ""/var/run/postgresql/.s.PGSQL.5432"""
1679078208511,2023-03-17 18:36:48.411 UTC [3965] LOG: database system was
shut down at 2023-03-17 18:36:48 UTC
1679078212980,2023-03-17 18:36:48.416 UTC [3964] LOG: database system is
ready to accept connections
1679078215787,"2023-03-17 18:36:55.701 UTC [4021] LOG: logical replication
apply worker for subscription ""polling_stations_0561a02f66363d911"" has
started"
1679078215787,"2023-03-17 18:36:55.751 UTC [4022] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""uk_geo_utils_onsud"" has
started"
1679078216038,"2023-03-17 18:36:55.769 UTC [4023] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""auth_user_groups"" has
started"
1679078216038,"2023-03-17 18:36:55.911 UTC [4022] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""uk_geo_utils_onsud"" has
finished"
1679078216038,"2023-03-17 18:36:55.915 UTC [4027] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""file_uploads_upload"" has
started"
1679078216038,"2023-03-17 18:36:55.924 UTC [4023] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""auth_user_groups"" has
finished"
1679078216038,"2023-03-17 18:36:55.936 UTC [4028] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table
""auth_user_user_permissions"" has started"
1679078216289,"2023-03-17 18:36:56.035 UTC [4027] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""file_uploads_upload"" has
finished"
1679078216289,"2023-03-17 18:36:56.045 UTC [4029] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""councils_usercouncils"" has
started"
1679078216289,"2023-03-17 18:36:56.057 UTC [4028] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table
""auth_user_user_permissions"" has finished"
1679078216289,"2023-03-17 18:36:56.065 UTC [4030] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table
""data_finder_loggedpostcode"" has started"
1679078216289,"2023-03-17 18:36:56.159 UTC [4029] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""councils_usercouncils"" has
finished"
1679078216289,"2023-03-17 18:36:56.167 UTC [4032] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table
""data_importers_dataquality"" has started"
1679078216289,"2023-03-17 18:36:56.192 UTC [4030] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table
""data_finder_loggedpostcode"" has finished"
1679078216540,"2023-03-17 18:36:56.200 UTC [4033] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""uk_geo_utils_address"" has
started"
1679078216540,"2023-03-17 18:36:56.294 UTC [4032] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table
""data_importers_dataquality"" has finished"
1679078216540,"2023-03-17 18:36:56.301 UTC [4035] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table
""pollingstations_customfinder"" has started"
1679078216540,"2023-03-17 18:36:56.374 UTC [4033] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""uk_geo_utils_address"" has
finished"
1679078216540,"2023-03-17 18:36:56.382 UTC [4036] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""file_uploads_file"" has
started"
1679078216540,"2023-03-17 18:36:56.412 UTC [4035] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table
""pollingstations_customfinder"" has finished"
1679078216790,"2023-03-17 18:36:56.420 UTC [4039] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""spatial_ref_sys"" has
started"
1679078216790,"2023-03-17 18:36:56.546 UTC [4036] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""file_uploads_file"" has
finished"
1679078216790,"2023-03-17 18:36:56.551 UTC [4041] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""addressbase_uprntocouncil""
has started"
1679078216790,"2023-03-17 18:36:56.604 UTC [4041] ERROR: could not create
replication slot ""pg_37986_sync_37706_7210774007126708177"": ERROR:
replication slot ""pg_37986_sync_37706_7210774007126708177"" already
exists"
1679078216790,"2023-03-17 18:36:56.608 UTC [3964] LOG: background worker
""logical replication worker"" (PID 4041) exited with exit code 1"
1679078216790,"2023-03-17 18:36:56.616 UTC [4042] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""auth_group_permissions""
has started"
1679078216790,"2023-03-17 18:36:56.704 UTC [4042] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""auth_group_permissions""
has finished"
1679078217041,"2023-03-17 18:36:56.717 UTC [4044] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""django_migrations"" has
started"
1679078217041,"2023-03-17 18:36:56.812 UTC [4039] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""spatial_ref_sys"" has
finished"
1679078217041,"2023-03-17 18:36:56.816 UTC [4047] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table
""pollingstations_pollingstation"" has started"
1679078217041,"2023-03-17 18:36:56.825 UTC [4044] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""django_migrations"" has
finished"
1679078217041,"2023-03-17 18:36:56.841 UTC [4048] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""django_admin_log"" has
started"
1679078217041,"2023-03-17 18:36:56.970 UTC [4048] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""django_admin_log"" has
finished"
1679078217292,"2023-03-17 18:36:56.978 UTC [4052] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""addressbase_address"" has
started"
1679078217292,"2023-03-17 18:36:57.173 UTC [4047] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table
""pollingstations_pollingstation"" has finished"
1679078217542,"2023-03-17 18:36:57.183 UTC [4056] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""django_session"" has
started"
1679078217542,"2023-03-17 18:36:57.323 UTC [4056] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""django_session"" has
finished"
1679078217792,"2023-03-17 18:36:57.356 UTC [4057] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table
""pollingstations_pollingdistrict"" has started"
1679078217792,"2023-03-17 18:36:57.587 UTC [4057] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table
""pollingstations_pollingdistrict"" has finished"
1679078218043,"2023-03-17 18:36:57.661 UTC [4058] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""auth_user"" has started"
1679078218043,"2023-03-17 18:36:57.821 UTC [4058] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""auth_user"" has finished"
1679078218043,"2023-03-17 18:36:57.835 UTC [4059] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""councils_council"" has
started"
1679078218293,"2023-03-17 18:36:58.021 UTC [4059] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""councils_council"" has
finished"
1679078222980,"2023-03-17 18:36:58.112 UTC [4061] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""councils_councilgeography""
has started"
1679078224057,"2023-03-17 18:37:03.822 UTC [4061] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""councils_councilgeography""
has finished"
1679078226313,"2023-03-17 18:37:03.834 UTC [4065] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""feedback_feedback"" has
started"
1679078226313,"2023-03-17 18:37:06.177 UTC [4065] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""feedback_feedback"" has
finished"
1679078226563,"2023-03-17 18:37:06.184 UTC [4066] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""django_site"" has started"
1679078226563,"2023-03-17 18:37:06.322 UTC [4066] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""django_site"" has finished"
1679078226563,"2023-03-17 18:37:06.328 UTC [4067] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""django_content_type"" has
started"
1679078226563,"2023-03-17 18:37:06.462 UTC [4067] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""django_content_type"" has
finished"
1679078226814,"2023-03-17 18:37:06.467 UTC [4068] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""auth_permission"" has
started"
1679078226814,"2023-03-17 18:37:06.590 UTC [4068] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""auth_permission"" has
finished"
1679078226814,"2023-03-17 18:37:06.596 UTC [4069] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""authtoken_token"" has
started"
1679078226814,"2023-03-17 18:37:06.741 UTC [4069] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""authtoken_token"" has
finished"
1679078227065,"2023-03-17 18:37:06.747 UTC [4070] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""auth_group"" has started"
1679078227065,"2023-03-17 18:37:06.892 UTC [4070] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""auth_group"" has finished"
1679078227065,"2023-03-17 18:37:06.900 UTC [4071] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""uk_geo_utils_onspd"" has
started"
1679078227065,"2023-03-17 18:37:06.976 UTC [4071] ERROR: could not create
replication slot ""pg_37986_sync_37922_7210774007126708177"": ERROR:
replication slot ""pg_37986_sync_37922_7210774007126708177"" already
exists"
1679078227065,"2023-03-17 18:37:06.979 UTC [3964] LOG: background worker
""logical replication worker"" (PID 4071) exited with exit code 1"
1679078227316,"2023-03-17 18:37:06.983 UTC [4072] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table
""pollingstations_advancevotingstation"" has started"
1679078227316,"2023-03-17 18:37:07.196 UTC [4072] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table
""pollingstations_advancevotingstation"" has finished"
1679078227566,"2023-03-17 18:37:07.201 UTC [4073] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""bug_reports_bugreport"" has
started"
1679078227566,"2023-03-17 18:37:07.370 UTC [4073] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""bug_reports_bugreport"" has
finished"
1679078231981,"2023-03-17 18:37:07.377 UTC [4074] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""addressbase_uprntocouncil""
has started"
1679078238098,2023-03-17 18:37:17.998 UTC [3966] LOG: checkpoints are
occurring too frequently (27 seconds apart)
1679078242980,"2023-03-17 18:37:17.998 UTC [3966] HINT: Consider
increasing the configuration parameter ""max_wal_size""."
1679078261904,2023-03-17 18:37:41.895 UTC [3966] LOG: checkpoints are
occurring too frequently (24 seconds apart)
1679078265980,"2023-03-17 18:37:41.895 UTC [3966] HINT: Consider
increasing the configuration parameter ""max_wal_size""."
1679078284964,2023-03-17 18:38:04.796 UTC [3966] LOG: checkpoints are
occurring too frequently (23 seconds apart)
1679078288981,"2023-03-17 18:38:04.796 UTC [3966] HINT: Consider
increasing the configuration parameter ""max_wal_size""."
1679078306522,2023-03-17 18:38:26.427 UTC [3966] LOG: checkpoints are
occurring too frequently (22 seconds apart)
1679078310980,"2023-03-17 18:38:26.427 UTC [3966] HINT: Consider
increasing the configuration parameter ""max_wal_size""."
1679078320313,"2023-03-17 18:38:40.216 UTC [4074] ERROR: could not start
WAL streaming: ERROR: replication slot
""pg_37986_sync_37706_7210774007126708177"" does not exist"
1679078321316,"2023-03-17 18:38:40.222 UTC [3964] LOG: background worker
""logical replication worker"" (PID 4074) exited with exit code 1"
1679078325075,"2023-03-17 18:38:41.232 UTC [4098] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""uk_geo_utils_onspd"" has
started"
1679078325075,2023-03-17 18:38:44.830 UTC [3966] LOG: checkpoints are
occurring too frequently (18 seconds apart)
1679078329981,"2023-03-17 18:38:44.830 UTC [3966] HINT: Consider
increasing the configuration parameter ""max_wal_size""."
1679078341118,2023-03-17 18:39:01.053 UTC [3966] LOG: checkpoints are
occurring too frequently (17 seconds apart)
1679078345981,"2023-03-17 18:39:01.053 UTC [3966] HINT: Consider
increasing the configuration parameter ""max_wal_size""."
1679078355665,2023-03-17 18:39:15.522 UTC [3966] LOG: checkpoints are
occurring too frequently (14 seconds apart)
1679078359981,"2023-03-17 18:39:15.522 UTC [3966] HINT: Consider
increasing the configuration parameter ""max_wal_size""."
1679078369702,2023-03-17 18:39:29.501 UTC [3966] LOG: checkpoints are
occurring too frequently (14 seconds apart)
1679078373981,"2023-03-17 18:39:29.501 UTC [3966] HINT: Consider
increasing the configuration parameter ""max_wal_size""."
1679078384001,2023-03-17 18:39:43.960 UTC [3966] LOG: checkpoints are
occurring too frequently (14 seconds apart)
1679078388980,"2023-03-17 18:39:43.960 UTC [3966] HINT: Consider
increasing the configuration parameter ""max_wal_size""."
1679078398285,2023-03-17 18:39:58.227 UTC [3966] LOG: checkpoints are
occurring too frequently (15 seconds apart)
1679078402981,"2023-03-17 18:39:58.227 UTC [3966] HINT: Consider
increasing the configuration parameter ""max_wal_size""."
1679078408316,"2023-03-17 18:40:08.130 UTC [4052] ERROR: could not start
WAL streaming: ERROR: replication slot
""pg_37986_sync_37663_7210774007126708177"" does not exist"
1679078409068,"2023-03-17 18:40:08.136 UTC [3964] LOG: background worker
""logical replication worker"" (PID 4052) exited with exit code 1"
1679078409069,"2023-03-17 18:40:08.884 UTC [4149] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""addressbase_uprntocouncil""
has started"
1679078409069,"2023-03-17 18:40:08.968 UTC [4149] ERROR: could not start
WAL streaming: ERROR: replication slot
""pg_37986_sync_37706_7210774007126708177"" does not exist"
1679078409069,"2023-03-17 18:40:08.971 UTC [3964] LOG: background worker
""logical replication worker"" (PID 4149) exited with exit code 1"
1679078409069,"2023-03-17 18:40:08.975 UTC [4150] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""addressbase_address"" has
started"
1679078409069,"2023-03-17 18:40:09.040 UTC [4150] ERROR: could not start
WAL streaming: ERROR: replication slot
""pg_37986_sync_37663_7210774007126708177"" does not exist"
1679078413980,"2023-03-17 18:40:09.043 UTC [3964] LOG: background worker
""logical replication worker"" (PID 4150) exited with exit code 1"
1679078414331,"2023-03-17 18:40:14.053 UTC [4154] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""addressbase_uprntocouncil""
has started"
1679078414331,"2023-03-17 18:40:14.108 UTC [4154] ERROR: could not start
WAL streaming: ERROR: replication slot
""pg_37986_sync_37706_7210774007126708177"" does not exist"
1679078414331,"2023-03-17 18:40:14.111 UTC [3964] LOG: background worker
""logical replication worker"" (PID 4154) exited with exit code 1"
1679078414331,"2023-03-17 18:40:14.115 UTC [4155] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""addressbase_address"" has
started"
1679078414331,"2023-03-17 18:40:14.168 UTC [4155] ERROR: could not start
WAL streaming: ERROR: replication slot
""pg_37986_sync_37663_7210774007126708177"" does not exist"
1679078415586,"2023-03-17 18:40:14.171 UTC [3964] LOG: background worker
""logical replication worker"" (PID 4155) exited with exit code 1"
1679078415586,2023-03-17 18:40:15.511 UTC [3966] LOG: checkpoints are
occurring too frequently (17 seconds apart)
1679078419982,"2023-03-17 18:40:15.511 UTC [3966] HINT: Consider
increasing the configuration parameter ""max_wal_size""."
1679078420098,"2023-03-17 18:40:19.930 UTC [4159] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""addressbase_uprntocouncil""
has started"
1679078420098,"2023-03-17 18:40:19.998 UTC [4159] ERROR: could not start
WAL streaming: ERROR: replication slot
""pg_37986_sync_37706_7210774007126708177"" does not exist"
1679078420098,"2023-03-17 18:40:20.001 UTC [3964] LOG: background worker
""logical replication worker"" (PID 4159) exited with exit code 1"
1679078420349,"2023-03-17 18:40:20.004 UTC [4160] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""addressbase_address"" has
started"
1679078420349,"2023-03-17 18:40:20.138 UTC [4160] ERROR: could not start
WAL streaming: ERROR: replication slot
""pg_37986_sync_37663_7210774007126708177"" does not exist"
1679078424980,"2023-03-17 18:40:20.141 UTC [3964] LOG: background worker
""logical replication worker"" (PID 4160) exited with exit code 1"
1679078425616,"2023-03-17 18:40:25.400 UTC [4164] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""addressbase_uprntocouncil""
has started"
1679078425616,"2023-03-17 18:40:25.481 UTC [4164] ERROR: could not start
WAL streaming: ERROR: replication slot
""pg_37986_sync_37706_7210774007126708177"" does not exist"
1679078425616,"2023-03-17 18:40:25.484 UTC [3964] LOG: background worker
""logical replication worker"" (PID 4164) exited with exit code 1"
1679078425616,"2023-03-17 18:40:25.487 UTC [4165] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""addressbase_address"" has
started"
1679078425616,"2023-03-17 18:40:25.563 UTC [4165] ERROR: could not start
WAL streaming: ERROR: replication slot
""pg_37986_sync_37663_7210774007126708177"" does not exist"
1679078429981,"2023-03-17 18:40:25.566 UTC [3964] LOG: background worker
""logical replication worker"" (PID 4165) exited with exit code 1"
1679078431133,"2023-03-17 18:40:31.093 UTC [4169] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""addressbase_uprntocouncil""
has started"
1679078431383,"2023-03-17 18:40:31.130 UTC [4169] ERROR: could not start
WAL streaming: ERROR: replication slot
""pg_37986_sync_37706_7210774007126708177"" does not exist"
1679078431383,"2023-03-17 18:40:31.134 UTC [3964] LOG: background worker
""logical replication worker"" (PID 4169) exited with exit code 1"
1679078431383,"2023-03-17 18:40:31.138 UTC [4170] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""addressbase_address"" has
started"
1679078431383,"2023-03-17 18:40:31.168 UTC [4170] ERROR: could not start
WAL streaming: ERROR: replication slot
""pg_37986_sync_37663_7210774007126708177"" does not exist"
1679078434391,"2023-03-17 18:40:31.171 UTC [3964] LOG: background worker
""logical replication worker"" (PID 4170) exited with exit code 1"
1679078434641,"2023-03-17 18:40:34.385 UTC [4098] ERROR: could not start
WAL streaming: ERROR: replication slot
""pg_37986_sync_37922_7210774007126708177"" does not exist"
1679078435644,"2023-03-17 18:40:34.409 UTC [3964] LOG: background worker
""logical replication worker"" (PID 4098) exited with exit code 1"
1679078435644,"2023-03-17 18:40:35.413 UTC [4173] LOG: logical replication
table synchronization worker for subscription
""polling_stations_0561a02f66363d911"", table ""uk_geo_utils_onspd"" has
started"
1679078435644,"2023-03-17 18:40:35.448 UTC [4173] ERROR: could not start
WAL streaming: ERROR: replication slot
""pg_37986_sync_37922_7210774007126708177"" does not exist"

On Wed, Mar 22, 2023 at 1:49 AM houzj(dot)fnst(at)fujitsu(dot)com <
houzj(dot)fnst(at)fujitsu(dot)com> wrote:

> On Monday, March 20, 2023 8:46 PM Will Roper <
> will(dot)roper(at)democracyclub(dot)org(dot)uk> wrote:
>
> Hi,
>
> > We’re having some issues with Postgresql’s logical replication.
> Specifically
> > trying to add several replicas at once. Essentially we can add replicas
> one at
> > a time, but when we try and add two or more together some of the table
> > subscriptions (as described in pg_subscription_rel) fail to get to
> > ‘STATE_READY’ (‘r’). Here’s a bit more detail.
> >
> > How the problem manifests/what we observe When we try to add several
> replicas
> > at the same time most of them have at least some tables/subscriptions
> (in the
> > pg_subscription_rel) that get stuck in STATE_FINISHEDCOPY and never reach
> > STATE_READY. The tables have the right number of rows, but the logs show
> > postgres is in a constant cycle of errors like:
> >
> > 2023-03-18 16:00:19.974 UTC [4207] LOG: logical replication table
> synchronization worker for subscription
> "polling_stations_01c7ee00f17021f94", table "uk_geo_utils_onspd" has started
> > 2023-03-18 16:00:19.999 UTC [4207] ERROR: could not start WAL
> streaming: ERROR: replication slot
> "pg_37982_sync_37918_7210774007126708177" does not exist
>
> Are there some other kind of ERROR messages on subscriber before the first
> "slot ... does
> not exist" ? I think it's possible that some other ERROR happened when
> first time
> setting the state to READY, and then the table sync keeps retrying and
> failing.
>
> Best Regards,
> Hou zj
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jim Vanns 2023-03-22 11:59:36 For temporary tables; truncate vs on commit delete all
Previous Message MS A 2023-03-22 07:05:12 Postgresql professional certification