Re: speed up a logical replica setup

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "'pgsql-hackers(at)lists(dot)postgresql(dot)org'" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: "Shlok Kyal" <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, "vignesh C" <vignesh21(at)gmail(dot)com>, "Michael Paquier" <michael(at)paquier(dot)xyz>, "Peter Eisentraut" <peter(at)eisentraut(dot)org>, "Andres Freund" <andres(at)anarazel(dot)de>, "Ashutosh Bapat" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: speed up a logical replica setup
Date: 2024-01-23 23:58:11
Message-ID: 73ab86ca-3fd5-49b3-9c80-73d1525202f1@app.fastmail.com
Lists: pgsql-hackers

On Mon, Jan 22, 2024, at 4:06 AM, Hayato Kuroda (Fujitsu) wrote:
> I analyzed this and found the reason: publications are invisible to some transactions.
>
> First, the operations below were executed in this case.
> Tuples were inserted after getting consistent_lsn, but before starting the standby.
> After running the workload, I confirmed again that the publication had been created.
>
> 1. on primary, logical replication slots were created.
> 2. on primary, another replication slot was created.
> 3. ===on primary, some tuples were inserted. ===
> 4. on standby, a server process was started
> 5. on standby, the process waited until all changes had arrived.
> 6. on primary, publications were created.
> 7. on standby, subscriptions were created.
> 8. on standby, the replication progress for each subscription was set to the given LSN (got at step2).
> =====pg_subscriber finished here=====
> 9. on standby, a server process was started again
> 10. on standby, subscriptions were enabled. They referred slots created at step1.
> 11. on primary, decoding was started but ERROR was raised.

Good catch! It is a design flaw.

> In this case, tuples were inserted *before creating publication*.
> So I thought that the decoded transaction could not see the publication because
> the publication was committed after the insertions.
>
> One solution is to create a publication before creating a consistent slot.
> Changes which came before creating the slot were surely replicated to the standby,
> so upcoming transactions can see the object. We are planning a patch set to fix
> the issue with this approach.

I'll include similar code in the next patch and also explain why the
publication should be created earlier. (I'm renaming
create_all_logical_replication_slots to setup_publisher, calling
create_publication from there, and adding the proposed GUC checks to it.)
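
To illustrate why the ordering matters, here is a toy model in Python. It is
only a sketch and not the pg_subscriber code: the Primary class, its methods,
and the integer LSN counter are all hypothetical, and the model assumes that
each decoded change sees the catalog as of that change's own LSN. Under that
assumption, a change inserted before the publication exists cannot see it,
whereas creating the publication before the consistent slot avoids the error.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Primary:
    """Toy primary server (hypothetical): an LSN counter plus recorded events."""
    lsn: int = 0
    publication_lsn: Optional[int] = None  # LSN at which the publication was created
    inserts: List[int] = field(default_factory=list)  # LSNs of inserted tuples

    def _advance(self) -> int:
        self.lsn += 1
        return self.lsn

    def create_slot(self) -> int:
        """Create a slot and return its consistent point."""
        return self._advance()

    def insert_tuple(self) -> None:
        self.inserts.append(self._advance())

    def create_publication(self) -> None:
        self.publication_lsn = self._advance()

    def decode(self, consistent_lsn: int) -> int:
        """Decode inserts after consistent_lsn.

        Assumption: each change is decoded with the catalog contents as of
        the change's own LSN, so a publication created later is not visible.
        """
        decoded = 0
        for change_lsn in self.inserts:
            if change_lsn <= consistent_lsn:
                continue  # already on the standby via physical replication
            if self.publication_lsn is None or self.publication_lsn > change_lsn:
                raise LookupError(f"publication not visible at LSN {change_lsn}")
            decoded += 1
        return decoded


def flawed_order() -> int:
    """Slot first, then inserts, then publication (the reported scenario)."""
    p = Primary()
    consistent = p.create_slot()
    p.insert_tuple()
    p.create_publication()
    return p.decode(consistent)


def fixed_order() -> int:
    """Publication first, then the consistent slot, then inserts."""
    p = Primary()
    p.create_publication()
    consistent = p.create_slot()
    p.insert_tuple()
    return p.decode(consistent)
```

In this model, flawed_order() raises LookupError while fixed_order() decodes
the inserted tuple, which mirrors the reasoning for moving publication
creation ahead of the consistent slot.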

--
Euler Taveira
EDB https://www.enterprisedb.com/
