Re: base backup client as auxiliary backend process

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: base backup client as auxiliary backend process
Date: 2020-02-03 12:47:31
Message-ID: 20200203124731.ykmyps26ohan4fsx@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Comment:

- It'd be good to split out the feature independent refactorings, like
the introduction of InitControlFile(), into their own commit. Right
now it's hard to separate out what should just should be moved code,
and what should be behavioural changes. Especially when there's stuff
like just reindenting comments as part of the patch.

> @@ -886,12 +891,27 @@ PostmasterMain(int argc, char *argv[])
> /* Verify that DataDir looks reasonable */
> checkDataDir();
>
> - /* Check that pg_control exists */
> - checkControlFile();
> -
> /* And switch working directory into it */
> ChangeToDataDir();
>
> + if (stat(BASEBACKUP_SIGNAL_FILE, &stat_buf) == 0)
> + {
> + int fd;
> +
> + fd = BasicOpenFilePerm(STANDBY_SIGNAL_FILE, O_RDWR | PG_BINARY,
> + S_IRUSR | S_IWUSR);
> + if (fd >= 0)
> + {
> + (void) pg_fsync(fd);
> + close(fd);
> + }
> + basebackup_signal_file_found = true;
> + }
> +
> + /* Check that pg_control exists */
> + if (!basebackup_signal_file_found)
> + checkControlFile();
> +

This should be moved into its own function, rather than open coded in
PostmasterMain(). Imagine how PostmasterMain() would look if all the
check/initialization functions weren't extracted into functions.

> /*
> * Check for invalid combinations of GUC settings.
> */
> @@ -970,7 +990,8 @@ PostmasterMain(int argc, char *argv[])
> * processes will inherit the correct function pointer and not need to
> * repeat the test.
> */
> - LocalProcessControlFile(false);
> + if (!basebackup_signal_file_found)
> + LocalProcessControlFile(false);
>
> /*
> * Initialize SSL library, if specified.
> @@ -1386,6 +1407,39 @@ PostmasterMain(int argc, char *argv[])
> */
> AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STARTING);
>
> + if (basebackup_signal_file_found)
> + {

This imo *really* should be a separate function.

> + BaseBackupPID = StartBaseBackup();
> +
> + /*
> + * Wait until done. Start WAL receiver in the meantime, once base
> + * backup has received the starting position.
> + */
> + while (BaseBackupPID != 0)
> + {
> + PG_SETMASK(&UnBlockSig);
> + pg_usleep(1000000L);
> + PG_SETMASK(&BlockSig);
> + MaybeStartWalReceiver();
> + }

Is there seriously no better signalling that we can use than just
looping for a couple hours?

Is it actully guaranteed that a compiler wouldn't just load
BaseBackupPID into a register, and never see a change to it done in a
signal handler?

There should be a note mentioning that we'll just FATAL out if the base
backup process fails. Otherwise it's the obvious question reading this
code. Also - we have handling to restart WAL receiver, but there's no
handling for the base backup temporarily failing: Is that just because
its easy to do in one, but not the other case?

> + /*
> + * Reread the control file that came in with the base backup.
> + */
> + ReadControlFile();
> + }

Is it actualy rereading? I'm just reading the diff, so maybe I'm missing
something, but you've made LocalProcessControlFile not enter this code
path...

> @@ -2824,6 +2880,8 @@ pmdie(SIGNAL_ARGS)
>
> if (StartupPID != 0)
> signal_child(StartupPID, SIGTERM);
> + if (BaseBackupPID != 0)
> + signal_child(BaseBackupPID, SIGTERM);
> if (BgWriterPID != 0)
> signal_child(BgWriterPID, SIGTERM);
> if (WalReceiverPID != 0)
> @@ -3062,6 +3120,23 @@ reaper(SIGNAL_ARGS)

> continue;
> }
>
> + /*
> + * Was it the base backup process?
> + */
> + if (pid == BaseBackupPID)
> + {
> + BaseBackupPID = 0;
> + if (EXIT_STATUS_0(exitstatus))
> + ;
> + else if (EXIT_STATUS_1(exitstatus))
> + ereport(FATAL,
> + (errmsg("base backup failed")));
> + else
> + HandleChildCrash(pid, exitstatus,
> + _("base backup process"));
> + continue;
> + }
> +

What's the error handling for the case we shut down either because of
SIGTERM above, or here? Does all the code just deal with that the next
start? If not, what makes this safe?

> +/*
> + * base backup worker process (client) main function
> + */
> +void
> +BaseBackupMain(void)
> +{
> + WalReceiverConn *wrconn = NULL;
> + char *err;
> + TimeLineID primaryTLI;
> + uint64 primary_sysid;
> +
> + /* Load the libpq-specific functions */
> + load_file("libpqwalreceiver", false);
> + if (WalReceiverFunctions == NULL)
> + elog(ERROR, "libpqwalreceiver didn't initialize correctly");
> +
> + /* Establish the connection to the primary */
> + wrconn = walrcv_connect(PrimaryConnInfo, false, cluster_name[0] ? cluster_name : "basebackup", &err);
> + if (!wrconn)
> + ereport(ERROR,
> + (errmsg("could not connect to the primary server: %s", err)));
> +
> + /*
> + * Get the remote sysid and stick it into the local control file, so that
> + * the walreceiver is happy. The control file will later be overwritten
> + * by the base backup.
> + */
> + primary_sysid = strtoull(walrcv_identify_system(wrconn, &primaryTLI), NULL, 10);
> + InitControlFile(primary_sysid);
> + WriteControlFile();
> +
> + walrcv_base_backup(wrconn);
> +
> + walrcv_disconnect(wrconn);
> +
> + SyncDataDirectory(false, ERROR);
> +
> + ereport(LOG,
> + (errmsg("base backup completed")));
> + proc_exit(0);
> +}

So there's no error handling here (as in a sigsetjmp)? Nor any signal
handlers set up, despite
+ case BaseBackupProcess:
+ /* don't set signals, basebackup has its own agenda */
+ BaseBackupMain();
+ proc_exit(1); /* should never return */
+

You did set up forwarding of things like SIGHUP - but afaict that's not
correctly wired up?

> diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
> index e4fd1f9bb6..52819d504c 100644
> --- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
> +++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
> @@ -17,20 +17,29 @@
> +#include "pgtar.h"
> #include "pqexpbuffer.h"
> #include "replication/walreceiver.h"
> #include "utils/builtins.h"
> +#include "utils/guc.h"
> #include "utils/memutils.h"
> #include "utils/pg_lsn.h"
> +#include "utils/ps_status.h"
> #include "utils/tuplestore.h"
>
> PG_MODULE_MAGIC;
> @@ -61,6 +70,7 @@ static int libpqrcv_server_version(WalReceiverConn *conn);
> static void libpqrcv_readtimelinehistoryfile(WalReceiverConn *conn,
> TimeLineID tli, char **filename,
> char **content, int *len);
> +static void libpqrcv_base_backup(WalReceiverConn *conn);
> static bool libpqrcv_startstreaming(WalReceiverConn *conn,
> const WalRcvStreamOptions *options);
> static void libpqrcv_endstreaming(WalReceiverConn *conn,
> @@ -89,6 +99,7 @@ static WalReceiverFunctionsType PQWalReceiverFunctions = {
> libpqrcv_identify_system,
> libpqrcv_server_version,
> libpqrcv_readtimelinehistoryfile,
> + libpqrcv_base_backup,
> libpqrcv_startstreaming,
> libpqrcv_endstreaming,
> libpqrcv_receive,
> @@ -358,6 +369,395 @@ libpqrcv_server_version(WalReceiverConn *conn)
> return PQserverVersion(conn->streamConn);
> }
>
> +/*
> + * XXX copied from pg_basebackup.c
> + */
> +
> +unsigned long long totaldone;
> +unsigned long long totalsize_kb;
> +int tablespacenum;
> +int tablespacecount;
> +
> +static void
> +base_backup_report_progress(void)
> +{

Putting all of this into libpqwalreceiver.c seems like quite a
significant modularity violation. The header says:

* libpqwalreceiver.c
*
* This file contains the libpq-specific parts of walreceiver. It's
* loaded as a dynamic module to avoid linking the main server binary with
* libpq.

which really doesn't agree with all of the new stuff you're putting
here.

> --- a/src/backend/storage/file/fd.c
> +++ b/src/backend/storage/file/fd.c
> @@ -3154,21 +3154,14 @@ looks_like_temp_rel_name(const char *name)
> * Other symlinks are presumed to point at files we're not responsible
> * for fsyncing, and might not have privileges to write at all.
> *
> - * Errors are logged but not considered fatal; that's because this is used
> - * only during database startup, to deal with the possibility that there are
> - * issued-but-unsynced writes pending against the data directory. We want to
> - * ensure that such writes reach disk before anything that's done in the new
> - * run. However, aborting on error would result in failure to start for
> - * harmless cases such as read-only files in the data directory, and that's
> - * not good either.
> - *
> - * Note that if we previously crashed due to a PANIC on fsync(), we'll be
> - * rewriting all changes again during recovery.
> + * If pre_sync is true, issue flush requests to the kernel before starting the
> + * actual fsync calls. This can be skipped if the caller has already done it
> + * itself.
> *

Huh, what happened with the previous comments here?

> diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
> index f9cfeae264..c9edeb54d3 100644
> --- a/src/bin/pg_resetwal/pg_resetwal.c
> +++ b/src/bin/pg_resetwal/pg_resetwal.c
> @@ -76,7 +76,7 @@ static int WalSegSz;
> static int set_wal_segsize;
>
> static void CheckDataVersion(void);
> -static bool ReadControlFile(void);
> +static bool read_controlfile(void);
> static void GuessControlValues(void);
> static void PrintControlValues(bool guessed);
> static void PrintNewControlValues(void);
> @@ -393,7 +393,7 @@ main(int argc, char *argv[])
> /*
> * Attempt to read the existing pg_control file
> */
> - if (!ReadControlFile())
> + if (!read_controlfile())
> GuessControlValues();
>
> /*
> @@ -578,7 +578,7 @@ CheckDataVersion(void)
> * to the current format. (Currently we don't do anything of the sort.)
> */
> static bool
> -ReadControlFile(void)
> +read_controlfile(void)
> {
> int fd;
> int len;

Huh?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2020-02-03 12:48:49 Re: Experimenting with hash join prefetch
Previous Message Daniel Gustafsson 2020-02-03 12:40:48 Re: Add %x to PROMPT1 and PROMPT2