Re: speed up a logical replica setup

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Shlok Kyal" <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, "Tomas Vondra" <tomas(dot)vondra(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Michael Paquier" <michael(at)paquier(dot)xyz>, "Peter Eisentraut" <peter(at)eisentraut(dot)org>, "Andres Freund" <andres(at)anarazel(dot)de>, "Ashutosh Bapat" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>, "vignesh C" <vignesh21(at)gmail(dot)com>
Subject: Re: speed up a logical replica setup
Date: 2024-03-21 04:48:30
Message-ID: deda3517-36c7-4305-bcd9-0fe01767d177@app.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 18, 2024, at 2:43 AM, Hayato Kuroda (Fujitsu) wrote:
> Thanks for updating the patch. Here are my comments.
> I used Grammarly to proofread sentences.
> (The tool strongly recommends to use active voice, but I can ignore for now)

Thanks for another review. I posted a new patch (v32) that hopefully addresses
these points.

> 01.
>
> "After a successful run, the state of the target server is analagous to a fresh
> logical replication setup."
> a/analagous/analogous

Fixed.

> 02.
>
> "The main difference between the logical replication setup and pg_createsubscriber
> is the initial data copy."
>
> Grammarly suggests:
> "The initial data copy is the main difference between the logical replication
> setup and pg_createsubscriber."

Not fixed.

> 03.
>
> "Only the synchronization phase is done, which ensures each table is brought up
> to a synchronized state."
>
> This sentence is not very clear to me. How about:
> "pg_createsubscriber does only the synchronization phase, ensuring each table's
> replication state is ready."

I avoided pg_createsubscriber at the beginning because it is already
used in the previous sentence. I kept the last part of the sentence
because it is similar to one in the logical replication [1].

> 04.
>
> "The pg_createsubscriber targets large database systems because most of the
> execution time is spent making the initial data copy."
>
> Hmm, but initial data sync by logical replication also spends most of time to
> make the initial data copy. IIUC bottlenecks are a) this application must stop
> and start server several times, and b) only the serial copy works. Can you
> clarify them?

Reading the sentence again, it is not clear. When I said "most of
execution time" I was referring to the actual logical replication setup.

> 05.
>
> It is better to say the internal difference between pg_createsubscriber and the
> initial sync by logical replication. For example:
> pg_createsubscriber uses a physical replication mechanism to ensure the standby
> catches up until a certain point. Then, it converts to the standby to the
> subscriber by promoting and creating subscriptions.

Isn't it better to leave these details to "How It Works"?

> 06.
>
> "If these are not met an error will be reported."
>
> Grammarly suggests:
> "If these are not met, an error will be reported."

Fixed.

> 07.
>
> "The given target data directory must have the same system identifier than the
> source data directory."
>
> Grammarly suggests:
> "The given target data directory must have the same system identifier as the
> source data directory."

Fixed.

> 08.
>
> "If a standby server is running on the target data directory or it is a base
> backup from the source data directory, system identifiers are the same."
>
> This line is not needed if bullet-style is not used. The line is just a supplement,
> not prerequisite.

Fixed.

> 09.
>
> "The source server must accept connections from the target server. The source server must not be in recovery."
>
> Grammarly suggests:
> "The source server must accept connections from the target server and not be in recovery."

Not fixed.

> 10.
>
> "Publications cannot be created in a read-only cluster."
>
> Same as 08, this line is not needed if bullet-style is not used.

Fixed.

> 11.
>
> "pg_createsubscriber usually starts the target server with different connection
> settings during the transformation steps. Hence, connections to target server
> might fail."
>
> Grammarly suggests:
> "pg_createsubscriber usually starts the target server with different connection
> settings during transformation. Hence, connections to the target server might fail."

Fixed.

> 12.
>
> "During the recovery process,"
>
> Grammarly suggests:
> "During recovery,"

Not fixed. Our documentation uses "recovery process".

> 13.
>
> "replicated so an error would occur."
>
> Grammarly suggests:
> "replicated, so an error would occur."

I didn't find this one. Maybe you checked a previous version.

> 14.
>
> "It would avoid situations in which WAL files from the source server might be
> used by the target server."
>
> Grammarly suggests:
> "It would avoid situations in which the target server might use WAL files from
> the source server."

Fixed.

> 15.
>
> "a LSN"
>
> s/a/an

Fixed.

> 16.
>
> "of write-ahead"
>
> s/of/of the/

Fixed.

> 17.
>
> "specifies promote"
>
> We can do double-quote for the word promote.

Why? It is referring to recovery_target_action. If you check this GUC,
you will notice that it also uses literal tag.

> 18.
>
> "are also added so it avoids"
>
> Grammarly suggests:
> "are added to avoid"

Fixed.

> 19.
>
> "is accepting read-write transactions"
>
> Grammarly suggests:
> "accepts read-write transactions"

Not fixed.

> 20.
>
> New options must be also documented as well. This helps not only users but also
> reviewers.
> (Sometimes we cannot identify that the implementation is intentinal or not.)

I don't know what are you referring to? If the new options are
--publication, --subscription and --replication-slot, they are
documented. Are you checking the latest patch?

> 21.
>
> Also, not sure the specification is good. I preferred to specify them by format
> string. Because it can reduce the number of arguments and I cannot find use cases
> which user want to control the name of objects.
>
> However, your approach has a benefit which users can easily identify the generated
> objects by pg_createsubscriber. How do other think?

I prefer explicit options. We can always expand it later if people think it is a
good idea to provide a format string.

> 22.
>
> ```
> #define BASE_OUTPUT_DIR "pg_createsubscriber_output.d"
> ```
>
> No one refers the define.

It was removed in v30.

> 23.
>
> ```
> } CreateSubscriberOptions;
> ...
> } LogicalRepInfo;
> ```
>
> Declarations after the "{" are not needed, because we do not do typedef.

It is a leftover when I removed the typedef.

> 22.
>
> While seeing definitions of functions, I found that some pointers are declared
> as const, but others are not. E.g., "char *lsn" in setup_recovery() won' be
> changed but not the constant. Is it just missing or is there another rule?

It slipped my mind. Peter's fixups improves it.

> 23.
>
> ```
> static int num_dbs = 0;
> static int num_pubs = 0;
> static int num_subs = 0;
> static int num_replslots = 0;
> ```
>
> I think the name is bit confusing. The number of generating publications/subscriptions/replication slots
> are always same as the number of databases. They just indicate the number of
> specified.
>
> My idea is num_custom_pubs or something. Thought?

What does "custom" add to make the name clear? I added comments saying
so.

> 24.
>
> ```
> /* standby / subscriber data directory */
> static char *subscriber_dir = NULL;
> ```
>
> It is bit strange that only subscriber_dir is a global variable. Caller requires
> the CreateSubscriberOptions as an argument, except cleanup_objects_atexit() and
> main. So, how about makeing `CreateSubscriberOptions opt` to global one?

I avoided turning all the options global variables. Since the cleanup routine
required the target data directory to be a global variable, I just did it and
left the others alone.

> 25.
>
> ```
> * Replication slots, publications and subscriptions are created. Depending on
> * the step it failed, it should remove the already created objects if it is
> * possible (sometimes it won't work due to a connection issue).
> ```
>
> I think it should be specified here that subscriptions won't be removed with the
> reason.

I rephrased this comment.

> 26.
>
> ```
>
> /*
> * If the server is promoted, there is no way to use the current setup
> * again. Warn the user that a new replication setup should be done before
> * trying again.
> */
> ```
>
> Per comment 25, we can add a reference like "See comments atop the function"

It is a few lines above. I don't think you have to point it out. If you
are unsure about this decision, you should check the whole function.

> 27.
>
> usage() was not updated based on recent changes.

Check v30.

> 28.
>
> ```
> if (strcmp(conn_opt->keyword, "dbname") == 0 && conn_opt->val != NULL)
> {
> if (dbname)
> *dbname = pg_strdup(conn_opt->val);
> continue;
> }
> ```
>
> There is a memory-leak if multiple dbname are specified in the conninfo.

It is not a worrying or critical memory leak.

> 29.
>
> ```
> pg_prng_seed(&prng_state, (uint64) (getpid() ^ time(NULL)));
> ```
>
> No need to initialize the seed every time. Can you reuse pg_prng_state?

Sure.

> 30.
>
> ```
> if (num_replslots == 0)
> dbinfo[i].replslotname = pg_strdup(genname);
> ```
>
> I think the straightforward way is to use the name of subscription if no name
> is specified. This follows the rule for CREATE SUBSCRIPTION.

Agreed.

> 31.
>
> ```
> /* Create replication slot on publisher */
> if (lsn)
> pg_free(lsn);
> ```
>
> I think allocating/freeing memory is not so efficient.
> Can we add a flag to create_logical_replication_slot() for controlling the
> returning value (NULL or duplicated string)? We can use the condition (i == num_dbs-1)
> as flag.

It is not. This code path is not critical. You are suggesting to add
complexity here. Efficiency is a good goal but in this case it only adds
complexity with small return.

> 32.
>
> ```
> /*
> * Close the connection. If exit_on_error is true, it has an undesired
> * condition and it should exit immediately.
> */
> static void
> disconnect_database(PGconn *conn, bool exit_on_error)
> ```
>
> In case of disconnect_database(), the second argument should have different name.
> If it is true, the process exits unconditionally.
> Also, comments atop the function must be fixed.

I choose a short name. The comment seems ok to me.

>
> 33.
>
> ```
> wal_level = strdup(PQgetvalue(res, 0, 0));
> ```
>
> pg_strdup should be used here.

Fixed.

> 34.
>
> ```
> {"config-file", required_argument, NULL, 1},
> {"publication", required_argument, NULL, 2},
> {"replication-slot", required_argument, NULL, 3},
> {"subscription", required_argument, NULL, 4},
> ```
>
> The ordering looks strange for me. According to pg_upgarade and pg_basebackup,
> options which do not have short notation are listed behind.

Fixed.

> 35.
>
> ```
> opt.sub_port = palloc(16);
> ```
>
> Per other lines, pg_alloc() should be used.

I think you meant pg_malloc. Fixed.

> 36.
>
> ```
> pg_free(opt.sub_port);
> ```
>
> You said that the leak won't be concerned here. If so, why only 'p' has pg_free()?

Fixed.

> 37.
>
> ```
> /* Register a function to clean up objects in case of failure */
> atexit(cleanup_objects_atexit);
> ```
>
> Sorry if we have already discussed. I think the registration can be moved just
> before the boot of the standby. Before that, the callback will be no-op.

The main reason is to catch future cases added *before* the point you
want to move this call that requires a cleanup. As you said it is a
no-op. My preference for atexit() calls is to add it as earlier as
possible to avoid leaving cases that it should trigger.

> 38.
>
> ```
> /* Subscriber PID file */
> snprintf(pidfile, MAXPGPATH, "%s/postmaster.pid", subscriber_dir);
>
> /*
> * If the standby server is running, stop it. Some parameters (that can
> * only be set at server start) are informed by command-line options.
> */
> if (stat(pidfile, &statbuf) == 0)
> ```
>
> Hmm. pidfile is used only here, but it is declared in main(). Can it be
> separated into another funtion like is_standby_started()?

It is so small that I didn't bother adding a new function for it.

> 39.
>
> Or, we may able to introcue "restart_standby_if_needed" or something.
>
> 40.
>
> ```
> * XXX this code was extracted from BootStrapXLOG().
> ```
>
> So, can we extract the common part to somewhere? Since system identifier is related
> with the controldata file, I think it can be located in controldata_util.c.

I added this comment here as a reference from where I extracted the
code. The referred function is from backend. Feel free to propose a
separate patch for it.

> 41.
>
> You said like below in [1], but I could not find the related fix. Can you clarify?
>
> > That's a good point. We should state in the documentation that GUCs specified in
> > the command-line options are ignored during the execution.

I added a sentence for it. See "How It Works".

[1] https://www.postgresql.org/docs/current/logical-replication-architecture.html#LOGICAL-REPLICATION-SNAPSHOT

--
Euler Taveira
EDB https://www.enterprisedb.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Euler Taveira 2024-03-21 04:56:11 Re: speed up a logical replica setup
Previous Message John Naylor 2024-03-21 04:30:30 Re: add AVX2 support to simd.h