Re: Logical replication fails when adding multiple replicas

From: Will Roper <will(dot)roper(at)democracyclub(dot)org(dot)uk>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: houzj(dot)fnst(at)fujitsu(dot)com, pgsql-general(at)postgresql(dot)org
Subject: Re: Logical replication fails when adding multiple replicas
Date: 2023-03-23 17:43:21
Message-ID: CA+xc_dtJhQGTqeb+OOdcJjknrchjdFs3miGgrMcA6W0dkUZW0g@mail.gmail.com
Lists: pgsql-general

OK, that makes sense. I think using something that is unique to each
subscriber is sensible, and postmaster startup time sounds reasonable!
Thanks for looking at it.

On Thu, Mar 23, 2023 at 8:17 AM Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
wrote:

> At Wed, 22 Mar 2023 09:25:37 +0000, Will Roper <
> will(dot)roper(at)democracyclub(dot)org(dot)uk> wrote in
> > Thanks for the response Hou,
> >
> > I've had a look and when the tablesync workers are spinning up there are
> > some errors of the form:
> >
> > "2023-03-17 18:37:06.900 UTC [4071] LOG: logical replication table
> > synchronization worker for subscription
> > ""polling_stations_0561a02f66363d911"", table ""uk_geo_utils_onspd"" has
> > started"
> > "2023-03-17 18:37:06.976 UTC [4071] ERROR: could not create replication
> > slot ""pg_37986_sync_37922_7210774007126708177"": ERROR: replication slot
> > ""pg_37986_sync_37922_7210774007126708177"" already exists"
>
> The slot name format is "pg_<suboid>_sync_<relid>_<systemid>". It's no
> surprise this happens if the subscribers come from the same
> backup.
>
> If that's true, the simplest workaround would be to recreate the
> subscription multiple times, using a different number of repetitions
> for each subscriber so that the subscribers have subscriptions with
> different OIDs.
>
>
>
> I believe it's not prohibited for subscribers to have the same system
> identifier, but the slot name generation logic for tablesync doesn't
> account for cases like this. We might need some server-wide value
> that's unique among subscribers and stable while table sync is
> running. I can't think of a better place than pg_subscription, but I
> don't like it because it's not really necessary for most of the
> subscription's life.
>
> Do you think using the postmaster's startup time would work for this
> purpose? I'm assuming that the slot name doesn't need to persist
> across server restarts, but I'm not sure that's really true.
>
>
> diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
> index 07eea504ba..a5b4f7cf7c 100644
> --- a/src/backend/replication/logical/tablesync.c
> +++ b/src/backend/replication/logical/tablesync.c
> @@ -1214,7 +1214,7 @@ ReplicationSlotNameForTablesync(Oid suboid, Oid relid,
>                                  char *syncslotname, Size szslot)
>  {
>      snprintf(syncslotname, szslot, "pg_%u_sync_%u_" UINT64_FORMAT, suboid,
> -             relid, GetSystemIdentifier());
> +             relid, PgStartTime);
>  }
>
>  /*
>
>
> regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>
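
The collision described in the quoted message can be illustrated with a
small standalone sketch. It is not PostgreSQL source: sync_slot_name() and
slot_names_collide() are hypothetical helpers that only mimic the
"pg_<suboid>_sync_<relid>_<systemid>" format quoted above, and the OIDs and
system identifier are the ones from the log excerpt. The assumption is that
two subscribers restored from the same backup share all three values.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the tablesync slot-name logic. It builds
 * "pg_<suboid>_sync_<relid>_<discriminator>", where the discriminator is
 * whatever 64-bit value the server contributes (the system identifier in
 * the quoted code).
 */
static void
sync_slot_name(uint32_t suboid, uint32_t relid, uint64_t discriminator,
               char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen,
                     "pg_%" PRIu32 "_sync_%" PRIu32 "_%" PRIu64,
                     suboid, relid, discriminator);

    /* Fail loudly if the name was truncated or formatting failed. */
    assert(n > 0 && (size_t) n < buflen);
}

/*
 * Two subscribers ask the same publisher for a slot. The second request
 * fails with "already exists" exactly when the names are identical.
 */
static bool
slot_names_collide(const char *first, const char *second)
{
    return strcmp(first, second) == 0;
}
```

Calling sync_slot_name(37986, 37922, 7210774007126708177, ...) for both
subscribers gives the same string, so slot_names_collide() returns true.
Passing a value that differs per server as the third argument gives
distinct names, which is the property the proposed change is after.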
