Re: speed up a logical replica setup

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Michael Paquier" <michael(at)paquier(dot)xyz>, "Peter Eisentraut" <peter(at)eisentraut(dot)org>, "Andres Freund" <andres(at)anarazel(dot)de>, "Ashutosh Bapat" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>, "Shlok Kyal" <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>, "vignesh C" <vignesh21(at)gmail(dot)com>
Subject: Re: speed up a logical replica setup
Date: 2024-03-07 04:34:38
Message-ID: 40595e73-c7e1-463a-b8be-49792e870007@app.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 6, 2024, at 7:02 AM, Hayato Kuroda (Fujitsu) wrote:
> Thanks for updating the patch!

Thanks for the feedback. I'm attaching v26 that addresses most of your comments
and some issues pointed by Vignesh [1].

> >* I also removed the check for standby is running. If the standby was stopped a
> > long time ago, it will take some time to reach the start point.
> >* Dry run mode has to start / stop the service to work correctly. Is it an
> > issue?
>
> One concern (see below comment) is that -l option would not be passed even if
> the standby has been logging before running pg_createsubscriber. Also, some settings
> passed by pg_ctl start -o .... would not be restored.

That's a good point. We should state in the documentation that GUCs specified in
the command-line options are ignored during the execution.

> >However, I decided to include --retain option, I'm thinking about to remove it.
> >If the logging is enabled, the information during the pg_createsubscriber will
> >be available. The client log can be redirected to a file for future inspection.
>
> Just to confirm - you meant to say like below, right?
> * the client output would be redirected, and
> * -r option would be removed.

Yes. The logging_collector is usually enabled or the syslog is collecting the
log entries. Under reflection, another log directory to store entries for a
short period of time doesn't seem a good idea. It divides the information and
it also costs development time. The questions that make me think about it were:
Should I remove the pg_createsubscriber_output.d directory if it runs
successfully? What if there is an old file there? Is it another directory to
exclude while taking a backup? I also don't like the long directory name.

> Here are my initial comments for v25-0001. I read new doc and looks very good.
> I may do reviewing more about v25-0001, but feel free to revise.
>
> 01. cleanup_objects_atexit
> ```
> PGconn *conn;
> int i;
> ```
> The declaration *conn can be in the for-loop. Also, the declaration of the indicator can be in the bracket.

Changed.

> 02. cleanup_objects_atexit
> ```
>
> /*
> * If a connection could not be established, inform the user
> * that some objects were left on primary and should be
> * removed before trying again.
> */
> if (dbinfo[i].made_publication)
> {
> pg_log_warning("There might be a publication \"%s\" in database \"%s\" on primary",
> dbinfo[i].pubname, dbinfo[i].dbname);
> pg_log_warning_hint("Consider dropping this publication before trying again.");
> }
> if (dbinfo[i].made_replslot)
> {
> pg_log_warning("There might be a replication slot \"%s\" in database \"%s\" on primary",
> dbinfo[i].subname, dbinfo[i].dbname);
> pg_log_warning_hint("Drop this replication slot soon to avoid retention of WAL files.");
> }
> ```
>
> Not sure which is better, but we may able to the list to the concrete file like pg_upgrade.
> (I thought it had been already discussed, but could not find from the archive. Sorry if it was a duplicated comment)

Do you mean the replication slot file? I think the replication slot and the
server (primary) is sufficient for checking and fixing if required.

> 03. main
> ```
> while ((c = getopt_long(argc, argv, "d:D:nP:rS:t:v",
> long_options, &option_index)) != -1)
> ```
>
> Missing update for __shortopts.

Fixed.

> 04. main
> ```
> case 'D':
> opt.subscriber_dir = pg_strdup(optarg);
> canonicalize_path(opt.subscriber_dir);
> break;
> ...
> case 'P':
> opt.pub_conninfo_str = pg_strdup(optarg);
> break;
> ...
> case 's':
> opt.socket_dir = pg_strdup(optarg);
> break;
> ...
> case 'U':
> opt.sub_username = pg_strdup(optarg);
> break;
> ```
>
> Should we consider the case these options would be specified twice?
> I.e., should we call pg_free() before the substitution?

It isn't a concern for the other client tools. I think the reason is that it
doesn't continue to leak memory during the execution. I wouldn't bother with it.

> 05. main
>
> Missing canonicalize_path() to the socket_dir.

Fixed.

> 06. main
> ```
> /*
> * If socket directory is not provided, use the current directory.
> */
> ```
>
> One-line comment can be used. Period can be also removed at that time.

Fixed.

> 07. main
> ```
> /*
> *
> * If subscriber username is not provided, check if the environment
> * variable sets it. If not, obtain the operating system name of the user
> * running it.
> */
> ```
> Unnecessary blank.

Fixed.

> 08. main
> ```
> char *errstr = NULL;
> ```
>
> This declaration can be at else-part.

Fixed.

> 09. main.
>
> Also, as the first place, do we have to get username if not specified?
> I felt libpq can handle the case if we skip passing the info.

Are you suggesting that the username should be optional?

> 10. main
> ```
> appendPQExpBuffer(sub_conninfo_str, "host=%s port=%u user=%s fallback_application_name=%s",
> opt.socket_dir, opt.sub_port, opt.sub_username, progname);
> sub_base_conninfo = get_base_conninfo(sub_conninfo_str->data, NULL);
> ```
>
> Is it really needed to call get_base_conninfo? I think no need to define
> sub_base_conninfo.

No. Good catch. I removed it.

> 11. main
>
> ```
> /*
> * In dry run mode, the server is restarted with the provided command-line
> * options so validation can be applied in the target server. In order to
> * preserve the initial state of the server (running), start it without
> * the command-line options.
> */
> if (dry_run)
> start_standby_server(&opt, pg_ctl_path, NULL, false);
> ```
>
> I think initial state of the server may be stopped. Now both conditions are allowed.
> And I think it is not good not to specify the logfile.

Indeed, it is. I don't consider the following test as the first step because it
is conditional. The stop server is a precondition to start. The first step is
the start standby server so the initial state is stopped.

/*
* If the standby server is running, stop it. Some parameters (that can
* only be set at server start) are informed by command-line options.
*/
if (stat(pidfile, &statbuf) == 0)
{

pg_log_info("standby is up and running");
pg_log_info("stopping the server to start the transformation steps");
stop_standby_server(pg_ctl_path, opt.subscriber_dir);
}

/*
* Start a short-lived standby server with temporary parameters (provided
* by command-line options). The goal is to avoid connections during the
* transformation steps.
*/
pg_log_info("starting the standby with command-line options");
start_standby_server(&opt, pg_ctl_path, server_start_log, true);

> 12. others
>
> As Peter E pointed out [1], the main function is still huge. It has more than 400 lines.
> I think all functions should have less than 100 line to keep the readability.

The previous versions moved a lot of code into its own function to improve
readability. Since you mentioned it again, it did some refactor to move some
code outside the main function. I created 2 new functions: setup_recovery
(groups instructions in preparation for recovery) and
drop_primary_replication_slot (remove the primary replication slot if it
exists). At this point, the main steps are in their own functions making it
easier to understand the code IMO.

/* Get the absolute path of pg_ctl and pg_resetwal on the subscriber */
pg_ctl_path = get_exec_path(argv[0], "pg_ctl");
pg_resetwal_path = get_exec_path(argv[0], "pg_resetwal");
...
pg_log_info("Done!");

The code snippet above has ~ 120 lines and the majority of the lines are
comments.

> I considered separation idea like below. Note that this may require to change
> orderings. How do you think?
>
> * add parse_command_options() which accepts user options and verifies them
> * add verification_phase() or something which checks system identifier and calls check_XXX
> * add catchup_phase() or something which creates a temporary slot, writes recovery parameters,
> and wait until the end of recovery
> * add cleanup_phase() or something which removes primary_slot and modifies the
> system identifier
> * stop/start server can be combined into one wrapper.

IMO generic steps are more difficult to understand. I tend to avoid it. However,
as I said above, I moved some code into its own function. We could probably
consider grouping the check for required/optional arguments into its own
function. Other than that, it wouldn't reduce the main() size and increase the
readability.

> Attached txt file is proofs the concept.
>
> 13. others
>
> PQresultStatus(res) is called 17 times in this source code, it may be redundant.
> I think we can introduce a function like executeQueryOrDie() and gather in one place.

That's a good goal.

> 14. others
>
> I found that pg_createsubscriber does not refer functions declared in other files.
> Is there a possibility to use them, e.g., streamutils.h?

Which functions? IIRC we discussed it in the beginning of this thread but I
didn't find low-hanging fruits to use in this code.

> 15. others
>
> While reading the old discussions [2], Amit suggested to keep the comment and avoid
> creating a temporary slot. You said "Got it" but temp slot still exists.
> Is there any reason? Can you clarify your opinion?

I decided to refactor the code and does what Amit Kapila suggested: use the last
replication slot LSN as the replication start point. I keep it in a separate
patch (v26-0002) to make it easier to review. I'm planning to incorporate it if
nobody objects.

> 16. others
>
> While reading [2] and [3], I was confused the decision. You and Amit discussed
> the combination with pg_createsubscriber and slot sync and how should handle
> slots on the physical standby. You seemed to agree to remove such a slot, and
> Amit also suggested to raise an ERROR. However, you said in [8] that such
> handlings is not mandatory so should raise an WARNING in dry_run. I was quite confused.
> Am I missing something?

I didn't address this item in this patch. As you pointed out, pg_resetwal does
nothing if replication slots exist. That's not an excuse for doing nothing here.
I agree that we should check this case and provide a suitable error message. If
you have a complex replication scenario, users won't be happy with this
restriction. We can always improve the UI (dropping replication slots during the
execution if an option is provided, for example).

> 17. others
>
> Per discussion around [4], we might have to consider an if the some options like
> data_directory and config_file was initially specified for standby server. Another
> easy approach is to allow users to specify options like -o in pg_upgrade [5],
> which is similar to your idea. Thought?

I didn't address this item in this patch. I have a half baked patch for it. The
proposal is exactly to allow appending config_file option into -o.

pg_ctl start -o "-c config_file=/etc/postgresql/17/postgresql.conf" ...

>
> 18. others
>
> How do you handle the reported failure [6]?

It is a PEBCAK. I don't see an easy way to detect the scenario 1. In the current
mode, we are susceptible to this human failure. The base backup support, won't
allow it. Regarding scenario 2, the referred error is the way to capture this
wrong command line. Do you expect a different message?

> 19. main
> ```
> char *pub_base_conninfo = NULL;
> char *sub_base_conninfo = NULL;
> char *dbname_conninfo = NULL;
> ```
>
> No need to initialize pub_base_conninfo and sub_base_conninfo.
> These variables would not be free'd.

Changed.

> 20. others
>
> IIUC, slot creations would not be finished if there are prepared transactions.
> Should we detect it on the verification phase and raise an ERROR?

Maybe. If we decide to do it, we should also check all cases not just prepared
transactions. The other option is to add a sentence into the documentation.

> 21. others
>
> As I said in [7], the catch up would not be finished if long recovery_min_apply_delay
> is used. Should we overwrite during the catch up?

No. If the time-delayed logical replica [2] was available, I would say that we
could use the apply delay for the logical replica. The user can expect that the
replica will continue to have the configured apply delay but that's not the case
if it silently ignore it. I'm not sure if an error is appropriate in this case
because it requires an extra step. Another option is to print a message saying
there is an apply delay. In dry run mode, user can detect this case and make a
decision. Does it seem reasonable?

> 22. pg_createsubscriber.sgml
> ```
> <para>
> Check
> Write recovery parameters into the target data...
> ```
>
> Not sure, but "Check" seems not needed.

It was a typo. Fixed.

[1] https://www.postgresql.org/message-id/CALDaNm215LodC48p7LmfAsuCq9m33CWtcag2%2B9DiyNfWGuL_KQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/f026292b-c9ee-472e-beaa-d32c5c3a2ced%40www.fastmail.com

--
Euler Taveira
EDB https://www.enterprisedb.com/

Attachment Content-Type Size
v26-0001-pg_createsubscriber-creates-a-new-logical-replic.patch.gz application/gzip 21.8 KB
v26-0002-Use-latest-replication-slot-position-as-replicat.patch.gz application/gzip 2.0 KB
v26-0003-port-replace-int-with-string.patch.gz application/gzip 1.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Euler Taveira 2024-03-07 04:43:02 Re: speed up a logical replica setup
Previous Message Amit Kapila 2024-03-07 04:32:36 Re: DOCS: Avoid using abbreviation "aka"