Re: Allow logical replication in the same cluster

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Allow logical replication in the same cluster
Date: 2025-09-06 04:32:31
Message-ID: CAA4eK1L7pXvdSrJfPHAu6k+C_K7gPSrsBR7w32w2ih12OMGZEw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 5, 2025 at 8:21 AM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>
> I was doing some test about logical replication a few days ago. When I tried to setup a logical replication on my Macbook.
>
> The basic workflow is simple:
>
> ```
> Step 1: edit postgresql.conf and set:
>
> wal_level = logical
> max_replication_slots = 4
> max_wal_senders = 4
>
> Step 2: create two databases for pub and sub
>
> % createdb pubdb
> % createdb subdb
>
> Step 3: create a table in pubdb, and create a publication
>
> pubdb=# CREATE TABLE t (id int primary key, data text);
> CREATE TABLE
>
> pubdb=# INSERT INTO t VALUES (1, 'hello from pub');
> INSERT 0 1
>
> pubdb=# CREATE PUBLICATION mypub FOR TABLE t;
> CREATE PUBLICATION
>
> Step 4: create the same table in subdb
>
> subdb=# CREATE TABLE t (id int primary key, data text);
> CREATE TABLE
>
> Step 5: create subscription in subdb
>
> subdb=# CREATE SUBSCRIPTION mysub CONNECTION 'host=localhost dbname=pubdb' PUBLICATION mypub; <==== stuck here
> ```
>
> In step 5, "CREATE SUBSCRIPTION" got stuck. Then I found the issue had been discussed with [1] in 2017, but no more effort had been spent resolving the issue.
>
> Then I investigated the root cause. Feels like this is a architectural problem. Because both pubdb and subdb run in the same cluster, so they share the same transaction id serial.
>
> In step 5, when subdb "CREATE SUBSCRIPTION", say the transaction id is 100, what the backend worker process does is like:
>
> 1) start a xact (100)
> 2) insert a tuple into pg_subscription
> 3) request pub side to create a sub slot and wait for the result
> 4) commit
>
> When the pub side receives the request to create a replication slot, it needs to check no running transactions. However, xact 100 is running and waiting for replication slot creation to finish. This is a deadlock, and the deadlock exists only when pub and sub are in the same cluster.
>

You can avoid this problem by creating a slot first on publisher with
something like:
postgres=# select pg_create_logical_replication_slot('s1', 'pgoutput',
false, true);
pg_create_logical_replication_slot
------------------------------------
(s1,0/01BFF178)
(1 row)

Then while creating subscription you can use the above created slot as follows:
db1=# create subscription sub1 connection 'dbname=postgres'
publication pub1 WITH(create_slot=false, slot_name='s1');
CREATE SUBSCRIPTION

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2025-09-06 05:03:30 Re: Parallel Apply
Previous Message Amit Kapila 2025-09-06 04:12:05 Re: POC: enable logical decoding when wal_level = 'replica' without a server restart