Re: speed up a logical replica setup

From: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
To: Euler Taveira <euler(at)eulerto(dot)com>
Cc: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
Subject: Re: speed up a logical replica setup
Date: 2024-03-05 03:48:35
Message-ID: CAHv8RjKst9j8GU7ZNNUcrOKsSc-fgO-yHRaJ1ze=uzJi-NrKUA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Mar 2, 2024 at 2:19 AM Euler Taveira <euler(at)eulerto(dot)com> wrote:
>
> On Thu, Feb 22, 2024, at 12:45 PM, Hayato Kuroda (Fujitsu) wrote:
>
> Based on the idea from Euler, I roughly implemented it. Thoughts?
>
> 0001-0013 were not changed from the previous version.
>
> V24-0014: addressed your comment in the replied e-mail.
> V24-0015: Add disconnect_database() again, per [3]
> V24-0016: addressed your comment in [4].
> V24-0017: addressed your comment in [5].
> V24-0018: addressed your comment in [6].
>
>
> Thanks for your review. I'm attaching v25 that hopefully addresses all pending
> points.
>
> Regarding your comments [1] on v21, I included changes for almost all items
> except 2, 20, 23, 24, and 25. I still think item 2 is not required because
> pg_ctl will provide a suitable message. I decided not to rearrange the block of
> SQL commands (item 20) mainly because it would avoid these objects on node_f.
> Do we really need command_checks_all? Depending on the output, it uses
> more cycles than command_ok.
>
> In summary:
>
> v24-0002: documentation is updated. I didn't apply this patch as-is. Instead, I
> checked what you wrote and fixed some gaps in what I had written.
> v24-0003: as I said, I don't think we need to add it; however, I won't fight
> against it if people want to add this check.
> v24-0004: I spent some time on it. This patch is not complete. I cleaned it up
> and included the start_standby_server code. It starts the server using the
> specified socket directory, port and username, hence preventing external client
> connections during the execution.
> v24-0005: partially applied
> v24-0006: applied with cosmetic change
> v24-0007: applied with cosmetic change
> v24-0008: applied
> v24-0009: applied with cosmetic change
> v24-0010: not applied. Based on #15, I refactored this code a bit. pg_fatal is
> not used when there is a database connection open. Instead, pg_log_error() is
> followed by disconnect_database(). In cases where it should exit immediately,
> disconnect_database() has a new parameter (exit_on_error) that controls whether
> it needs to call exit(1). I went ahead and did the same for connect_database().
> v24-0011: partially applied. I included some of the suggestions (18, 19, and 21).
> v24-0012: not applied. On reflection, after working on v24-0004, the target
> server will start with new parameters (that only accept local connections);
> hence, during the execution it is no longer possible to detect if the target
> server is a primary for another server. I added a sentence about it in the
> documentation (Warning section).
> v24-0013: good catch. Applied.
> v24-0014: partially applied. After some experiments I decided to use a small
> number of attempts. The current code didn't reset the counter if the connection
> is reestablished. I included the documentation suggestion. I didn't include the
> IF EXISTS in the DROP PUBLICATION because it doesn't solve the issue. Instead,
> I refactored the drop_publication() to not try again if the DROP PUBLICATION
> failed.
> v24-0015: not applied. I refactored the exit code to do the right thing: print
> error message, disconnect database (if applicable) and exit.
> v24-0016: not applied. But I checked whether the information was present in
> the documentation; it is.
> v24-0017: good catch. Applied.
> v24-0018: not applied.
>
> I removed almost all boolean return values and included the error logic inside
> the functions. It reads better. I also changed the connect|disconnect_database
> functions to include the error logic inside them. There is a new parameter
> on_error_exit for it. I removed the action parameter from pg_ctl_status() -- I
> think someone suggested it -- and the error message was moved to outside the
> function. I improved the cleanup routine. It provides useful information if it
> cannot remove the object (publication or replication slot) from the primary.
>
> Since I applied v24-0004, I realized that extra start / stop steps for the
> service are required. It means pg_createsubscriber doesn't start the
> transformation with the current standby settings. Instead, it stops the standby
> if it is running and starts it with the provided command-line options (socket,
> port, listen_addresses). It has a few drawbacks:
> * See v24-0012. It cannot detect if the target server is a primary for another
> server. It is documented.
> * I also removed the check that the standby is running. If the standby was
> stopped a long time ago, it will take some time to reach the start point.
> * Dry run mode has to start / stop the service to work correctly. Is it an
> issue?
>
> Although I decided to include the --retain option, I'm thinking about removing
> it. If logging is enabled, the information from the pg_createsubscriber run
> will be available. The client log can be redirected to a file for future
> inspection.
>
> Comments?

I applied the v25 patch and ran the 'pg_createsubscriber' command.
I was unable to execute it and got the following error:

$ ./pg_createsubscriber -D node2/ -P "host=localhost port=5432 dbname=postgres" -d postgres -d db1 -p 9000 -r
./pg_createsubscriber: invalid option -- 'p'
pg_createsubscriber: hint: Try "pg_createsubscriber --help" for more information.

Here, the -p option is rejected, so the desired port number cannot be
specified. Thoughts?

Thanks and Regards,
Shubham Khanna.
