Re: speed up a logical replica setup

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: "Shlok Kyal" <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, "vignesh C" <vignesh21(at)gmail(dot)com>, "Michael Paquier" <michael(at)paquier(dot)xyz>, "Peter Eisentraut" <peter(at)eisentraut(dot)org>, "Andres Freund" <andres(at)anarazel(dot)de>, "Ashutosh Bapat" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>, "'Shubham Khanna'" <khannashubham1197(at)gmail(dot)com>
Subject: Re: speed up a logical replica setup
Date: 2024-01-26 00:28:45
Message-ID: be92c57b-82e1-4920-ac31-a8a04206db7b@app.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 25, 2024, at 6:05 AM, Hayato Kuroda (Fujitsu) wrote:
> 01.
> ```
> /* Options */
> static char *pub_conninfo_str = NULL;
> static SimpleStringList database_names = {NULL, NULL};
> static int wait_seconds = DEFAULT_WAIT;
> static bool retain = false;
> static bool dry_run = false;
> ```
>
> Just to confirm - is there a policy why we store the specified options? If you
> want to store as global ones, username and port should follow (my fault...).
> Or, should we have a structure to store them?

It is a matter of style I would say. Check other client applications. Some of
them also use global variable. There are others that group options into a
struct. I would say that since it has a short lifetime, I don't think the
current style is harmful.

> 04.
> ```
> {"dry-run", no_argument, NULL, 'n'},
> ```
>
> I'm not sure why the dry_run mode exists. In terms pg_resetwal, it shows the
> which value would be changed based on the input. As for the pg_upgrade, it checks
> whether the node can be upgraded for now. I think, we should have the checking
> feature, so it should be renamed to --check. Also, the process should exit earlier
> at that time.

It is extremely useful because (a) you have a physical replication setup and
don't know if it is prepared for logical replication, (b) check GUCs (is
max_wal_senders sufficient for this pg_subscriber command? Or is
max_replication_slots sufficient to setup the logical replication even though I
already have some used replication slots?), (c) connectivity and (d)
credentials.

> 05.
> I felt we should accept some settings from enviroment variables, like pg_upgrade.
> Currently, below items should be acceted.
>
> - data directory
> - username
> - port
> - timeout

Maybe PGDATA.

> 06.
> ```
> pg_logging_set_level(PG_LOG_WARNING);
> ```
>
> If the default log level is warning, there are no ways to output debug logs.
> (-v option only raises one, so INFO would be output)
> I think it should be PG_LOG_INFO.

You need to specify multiple -v options.

> 07.
> Can we combine verifications into two functions, e.g., check_primary() and check_standby/check_subscriber()?

I think v9 does it.

> 08.
> Not sure, but if we want to record outputs by pg_subscriber, the sub-directory
> should be created. The name should contain the timestamp.

The log file already contains the timestamp. Why?

> 09.
> Not sure, but should we check max_slot_wal_keep_size of primary server? It can
> avoid to fail starting of logical replicaiton.

A broken physical replication *before* running this tool is its responsibility?
Hmm. We might add another check that can be noticed during dry run mode.

> 10.
> ```
> nslots_new = nslots_old + dbarr.ndbs;
>
> if (nslots_new > max_replication_slots)
> {
> pg_log_error("max_replication_slots (%d) must be greater than or equal to "
> "the number of replication slots required (%d)", max_replication_slots, nslots_new);
> exit(1);
> }
> ```
>
> I think standby server must not have replication slots. Because subsequent
> pg_resetwal command discards all the WAL file, so WAL records pointed by them
> are removed. Currently pg_resetwal does not raise ERROR at that time.

Again, dry run mode might provide a message for it.

> 11.
> ```
> /*
> * Stop the subscriber if it is a standby server. Before executing the
> * transformation steps, make sure the subscriber is not running because
> * one of the steps is to modify some recovery parameters that require a
> * restart.
> */
> if (stat(pidfile, &statbuf) == 0)
> ```
>
> I kept just in case, but I'm not sure it is still needed. How do you think?
> Removing it can reduce an inclusion of pidfile.h.

Are you suggesting another way to check if the standby is up and running?

> 12.
> ```
> pg_ctl_cmd = psprintf("\"%s/pg_ctl\" stop -D \"%s\" -s",
> standby.bindir, standby.pgdata);
> rc = system(pg_ctl_cmd);
> pg_ctl_status(pg_ctl_cmd, rc, 0);
> ```
>
>
> There are two places to stop the instance. Can you divide it into a function?

Yes.

> 13.
> ```
> * A temporary replication slot is not used here to avoid keeping a
> * replication connection open (depending when base backup was taken, the
> * connection should be open for a few hours).
> */
> conn = connect_database(primary.base_conninfo, dbarr.perdb[0].dbname);
> if (conn == NULL)
> exit(1);
> consistent_lsn = create_logical_replication_slot(conn, true,
> &dbarr.perdb[0]);
> ```
>
> I didn't notice the comment, but still not sure the reason. Why we must reserve
> the slot until pg_subscriber finishes? IIUC, the slot would be never used, it
> is created only for getting a consistent_lsn. So we do not have to keep.
> Also, just before, logical replication slots for each databases are created, so
> WAL records are surely reserved.

This comment needs to be updated. It was written at the time I was pursuing
base backup support too. It doesn't matter if you remove this transient
replication slot earlier because all of the replication slots created to the
subscriptions were created *before* the one for the consistent LSN. Hence, no
additional WAL retention due to this transient replication slot.

> 14.
>
> ```
> pg_log_info("starting the subscriber");
> start_standby_server(&standby, subport, server_start_log);
> ```
>
> This info should be in the function.

Ok.

> 15.
> ```
> /*
> * Create a subscription for each database.
> */
> for (i = 0; i < dbarr.ndbs; i++)
> ```
>
> This can be divided into a function, like create_all_subscriptions().

Ok.

> 16.
> My fault: usage() must be updated.
>
> 17. use_primary_slot_name
> ```
> if (PQntuples(res) != 1)
> {
> pg_log_error("could not obtain replication slot information: got %d rows, expected %d row",
> PQntuples(res), 1);
> return NULL;
> }
> ```
>
> Error message should be changed. I think this error means the standby has wrong primary_slot_name, right?

I refactored this code a bit but the message is the same. It detects 2 cases:
(a) you set primary_slot_name but you don't have a replication slot with the
same name and (b) a cannot-happen bug that provides > 1 rows. It is a broken
setup so maybe a hint saying so is enough.

> 18. misc
> Sometimes the pid of pg_subscriber is referred. It can be stored as global variable.

I prefer to keep getpid() call.

> 19.
> C99-style has been allowed, so loop variables like "i" can be declared in the for-statement, like
>
> ```
> for (int i = 0; i < MAX; i++)
> ```

v9 does it.

> 20.
> Some comments, docs, and outputs must be fixed when the name is changed.

Next patch.

--
Euler Taveira
EDB https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yugo NAGATA 2024-01-26 00:42:37 Re: pg_column_toast_chunk_id: a function to get a chunk ID of a TOASTed value
Previous Message Michael Paquier 2024-01-25 23:35:19 Re: Make COPY format extendable: Extract COPY TO format implementations