Re: speed up a logical replica setup

From: vignesh C <vignesh21(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Euler Taveira <euler(at)eulerto(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
Subject: Re: speed up a logical replica setup
Date: 2024-02-20 11:32:20
Message-ID: CALDaNm2qHuZZvh6ym6OM367RfozU7RaaRDSm=F8M3SNrcQG2pQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 19 Feb 2024 at 11:15, Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Dear hackers,
>
> > Since it may be useful, I will post top-up patch on Monday, if there are no
> > updating.
>
> And here are top-up patches. Feel free to check and include.
>
> v22-0001: Same as v21-0001.
> === rebased patches ===
> v22-0002: Update docs per recent changes. Same as v20-0002.
> v22-0003: Add check versions of the target. Extracted from v20-0003.
> v22-0004: Remove -S option. Mostly same as v20-0009, but commit massage was
> slightly changed.
> === Newbie ===
> V22-0005: Addressed my comments which seems to be trivial[1].
> Comments #1, 3, 4, 8, 10, 14, 17 were addressed here.
> v22-0006: Consider the scenario when commands are failed after the recovery.
> drop_subscription() is removed and some messages are added per [2].
> V22-0007: Revise server_is_in_recovery() per [1]. Comments #5, 6, 7, were addressed here.
> V22-0008: Fix a strange report when physical_primary_slot is null. Per comment #9 [1].
> V22-0009: Prohibit reuse publications when it has already existed. Per comments #11 and 12 [1].
> V22-0010: Avoid to call PQclear()/PQfinish()/pg_free() if the process exits soon. Per comment #15 [1].
> V22-0011: Update testcode. Per comments #17- [1].
>
> I did not handle below points because I have unclear points.
>
> a.
> This patch set cannot detect the disconnection between the target (standby) and
> source (primary) during the catch up. Because the connection status must be gotten
> at the same time (=in the same query) with the recovery status, but now it is now an
> independed function (server_is_in_recovery()).
>
> b.
> This patch set cannot detect the inconsistency reported by Shubham [3]. I could not
> come up with solutions without removing -P...

Few comments regarding the documentation:
1) max_replication_slots information seems to be present couple of times:

+ <para>
+ The target instance must have
+ <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+ and <link
linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ configured to a value greater than or equal to the number of target
+ databases.
+ </para>

+ <listitem>
+ <para>
+ The target instance must have
+ <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+ configured to a value greater than or equal to the number of target
+ databases and replication slots.
+ </para>
+ </listitem>

2) Can we add an id to prerequisites and use it instead of referring
to r1-app-pg_createsubscriber-1:
- <application>pg_createsubscriber</application> checks if the
given target data
- directory has the same system identifier than the source data directory.
- Since it uses the recovery process as one of the steps, it starts the
- target server as a replica from the source server. If the system
- identifier is not the same,
<application>pg_createsubscriber</application> will
- terminate with an error.
+ Checks the target can be converted. In particular, things listed in
+ <link linkend="r1-app-pg_createsubscriber-1">above section</link> would be
+ checked. If these are not met
<application>pg_createsubscriber</application>
+ will terminate with an error.
</para>

3) The code also checks the following:
Verify if a PostgreSQL binary (progname) is available in the same
directory as pg_createsubscriber.

But this is not present in the pre-requisites of documentation.

4) Here we mention that the target server should be stopped, but the
same is not mentioned in prerequisites:
+ Here is an example of using <application>pg_createsubscriber</application>.
+ Before running the command, please make sure target server is stopped.
+<screen>
+<prompt>$</prompt> <userinput>pg_ctl -D /usr/local/pgsql/data stop</userinput>
+</screen>
+

5) If there is an error during any of the pg_createsubscriber
operation like if create subscription fails, it might not be possible
to rollback to the earlier state which had physical-standby
replication. I felt we should document this and also add it to the
console message like how we do in case of pg_upgrade.

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2024-02-20 11:39:37 Re: Replace current implementations in crypt() and gen_salt() to OpenSSL
Previous Message Amit Kapila 2024-02-20 11:31:27 Re: A new message seems missing a punctuation