Re: Refactoring backend fork+exec code

From: Andres Freund <andres(at)anarazel(dot)de>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Tristan Partin <tristan(at)neon(dot)tech>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Refactoring backend fork+exec code
Date: 2023-11-30 20:26:48
Message-ID: 20231130202648.7k6agmuizdilufnv@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-11-30 01:36:25 +0200, Heikki Linnakangas wrote:
> - patch 1 splits CreateSharedMemoryAndSemaphores into two functions:
> CreateSharedMemoryAndSemaphores is now only called at postmaster startup,
> and a new function called AttachSharedMemoryStructs() is called in backends
> in EXEC_BACKEND mode. I extracted the common parts of those functions to a
> new static function. (Some of this refactoring used to be part of the 3rd
> patch in the series, but it seems useful on its own, so I split it out.)

I like that idea.

> From a96b6e92fdeaa947bf32774c425419b8f987b8e2 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
> Date: Thu, 30 Nov 2023 00:01:25 +0200
> Subject: [PATCH v3 1/7] Refactor CreateSharedMemoryAndSemaphores
>
> For clarity, have separate functions for *creating* the shared memory
> and semaphores at postmaster or single-user backend startup, and
> for *attaching* to existing shared memory structures in EXEC_BACKEND
> case. CreateSharedMemoryAndSemaphores() is now called only at
> postmaster startup, and a new AttachSharedMemoryStructs() function is
> called at backend startup in EXEC_BACKEND mode.

I assume CreateSharedMemoryAndSemaphores() is still called during crash
restart? I wonder if it shouldn't split three ways:
1) create
2) initialize
3) attach

> From 3478cafcf74a5c8d649e0287e6c72669a29c0e70 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
> Date: Thu, 30 Nov 2023 00:02:03 +0200
> Subject: [PATCH v3 2/7] Pass BackgroundWorker entry in the parameter file in
> EXEC_BACKEND mode
>
> This makes it possible to move InitProcess later in SubPostmasterMain
> (in next commit), as we no longer need to access shared memory to read
> background worker entry.

> static void read_backend_variables(char *id, Port *port);
> @@ -4831,7 +4833,7 @@ SubPostmasterMain(int argc, char *argv[])
> strcmp(argv[1], "--forkavlauncher") == 0 ||
> strcmp(argv[1], "--forkavworker") == 0 ||
> strcmp(argv[1], "--forkaux") == 0 ||
> - strncmp(argv[1], "--forkbgworker=", 15) == 0)
> + strncmp(argv[1], "--forkbgworker", 14) == 0)
> PGSharedMemoryReAttach();
> else
> PGSharedMemoryNoReAttach();
> @@ -4962,10 +4964,8 @@ SubPostmasterMain(int argc, char *argv[])
>
> AutoVacWorkerMain(argc - 2, argv + 2); /* does not return */
> }
> - if (strncmp(argv[1], "--forkbgworker=", 15) == 0)
> + if (strncmp(argv[1], "--forkbgworker", 14) == 0)

Now that we don't need to look at parameters anymore, these should probably be
just a strcmp(), like the other cases?

> From 0d071474e12a70ff8113c7b0731c5b97fec45007 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
> Date: Wed, 29 Nov 2023 23:47:25 +0200
> Subject: [PATCH v3 3/7] Refactor how InitProcess is called
>
> The order of process initialization steps is now more consistent
> between !EXEC_BACKEND and EXEC_BACKEND modes. InitProcess() is called
> at the same place in either mode. We can now also move the
> AttachSharedMemoryStructs() call into InitProcess() itself. This
> reduces the number of "#ifdef EXEC_BACKEND" blocks.

Yay.

> diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
> index cdfdd6fbe1d..6c708777dde 100644
> --- a/src/backend/storage/lmgr/proc.c
> +++ b/src/backend/storage/lmgr/proc.c
> @@ -461,6 +461,12 @@ InitProcess(void)
> */
> InitLWLockAccess();
> InitDeadLockChecking();
> +
> +#ifdef EXEC_BACKEND
> + /* Attach process to shared data structures */
> + if (IsUnderPostmaster)
> + AttachSharedMemoryStructs();
> +#endif
> }
>
> /*
> @@ -614,6 +620,12 @@ InitAuxiliaryProcess(void)
> * Arrange to clean up at process exit.
> */
> on_shmem_exit(AuxiliaryProcKill, Int32GetDatum(proctype));
> +
> +#ifdef EXEC_BACKEND
> + /* Attach process to shared data structures */
> + if (IsUnderPostmaster)
> + AttachSharedMemoryStructs();
> +#endif
> }

Aside: Somewhat odd that InitAuxiliaryProcess() doesn't call
InitLWLockAccess().

I think a short comment explaining why we can attach to shmem structs after
already accessing shared memory earlier in the function would be worthwhile.

> From ce51876f87f1e4317e25baf64184749448fcd033 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
> Date: Thu, 30 Nov 2023 00:07:34 +0200
> Subject: [PATCH v3 7/7] Refactor postmaster child process launching
>
> - Move code related to launching backend processes to new source file,
> launch_backend.c
>
> - Introduce new postmaster_child_launch() function that deals with the
> differences between EXEC_BACKEND and fork mode.
>
> - Refactor the mechanism of passing information from the parent to
> child process. Instead of using different command-line arguments when
> launching the child process in EXEC_BACKEND mode, pass a
> variable-length blob of data along with all the global variables. The
> contents of that blob depend on the kind of child process being
> launched. In !EXEC_BACKEND mode, we use the same blob, but it's simply
> inherited from the parent to child process.
>
> [...]
> 33 files changed, 1787 insertions(+), 2002 deletions(-)

Well, that's not small...

I think it may be worth splitting some of the file renaming out into a
separate commit, makes it harder to see what changed here.

> +AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
> {
> - pid_t AutoVacPID;
> + sigjmp_buf local_sigjmp_buf;
>
> -#ifdef EXEC_BACKEND
> - switch ((AutoVacPID = avlauncher_forkexec()))
> -#else
> - switch ((AutoVacPID = fork_process()))
> -#endif
> + /* Release postmaster's working memory context */
> + if (PostmasterContext)
> {
> - case -1:
> - ereport(LOG,
> - (errmsg("could not fork autovacuum launcher process: %m")));
> - return 0;
> -
> -#ifndef EXEC_BACKEND
> - case 0:
> - /* in postmaster child ... */
> - InitPostmasterChild();
> -
> - /* Close the postmaster's sockets */
> - ClosePostmasterPorts(false);
> -
> - AutoVacLauncherMain(0, NULL);
> - break;
> -#endif
> - default:
> - return (int) AutoVacPID;
> + MemoryContextDelete(PostmasterContext);
> + PostmasterContext = NULL;
> }
>
> - /* shouldn't get here */
> - return 0;
> -}

This if (PostmasterContext) ... else "shouldn't get here" business seems
pretty silly, more likely to hide problems than to help.

> +/*
> + * Information needed to launch different kinds of child processes.
> + */
> +static const struct
> +{
> + const char *name;
> + void (*main_fn) (char *startup_data, size_t startup_data_len);
> + bool shmem_attach;
> +} entry_kinds[] = {
> + [PMC_BACKEND] = {"backend", BackendMain, true},

Personally I'd give the struct an actual name - makes the debugging experience
a bit nicer than anonymous structs that you can't even reference by a typedef.

> + [PMC_AV_LAUNCHER] = {"autovacuum launcher", AutoVacLauncherMain, true},
> + [PMC_AV_WORKER] = {"autovacuum worker", AutoVacWorkerMain, true},
> + [PMC_BGWORKER] = {"bgworker", BackgroundWorkerMain, true},
> + [PMC_SYSLOGGER] = {"syslogger", SysLoggerMain, false},
> +
> + [PMC_STARTUP] = {"startup", StartupProcessMain, true},
> + [PMC_BGWRITER] = {"bgwriter", BackgroundWriterMain, true},
> + [PMC_ARCHIVER] = {"archiver", PgArchiverMain, true},
> + [PMC_CHECKPOINTER] = {"checkpointer", CheckpointerMain, true},
> + [PMC_WAL_WRITER] = {"wal_writer", WalWriterMain, true},
> + [PMC_WAL_RECEIVER] = {"wal_receiver", WalReceiverMain, true},
> +};

It feels like we have too many different ways of documenting the type of a
process. This new PMC_ stuff, enum AuxProcType, enum BackendType. Which then
leads to code like this:

> -CheckpointerMain(void)
> +CheckpointerMain(char *startup_data, size_t startup_data_len)
> {
> sigjmp_buf local_sigjmp_buf;
> MemoryContext checkpointer_context;
>
> + Assert(startup_data_len == 0);
> +
> + MyAuxProcType = CheckpointerProcess;
> + MyBackendType = B_CHECKPOINTER;
> + AuxiliaryProcessInit();
> +

For each type of child process. That seems a bit too redundant. Can't we
unify this at least somewhat? Can't we just reuse BackendType here? Sure,
there'd be pointless entry for B_INVALID, but that doesn't seem like a
problem, could even be useful, by pointing it to a function raising an error.

At the very least this shouldn't deviate from the naming pattern of
BackendType.

> +/*
> + * SubPostmasterMain -- Get the fork/exec'd process into a state equivalent
> + * to what it would be if we'd simply forked on Unix, and then
> + * dispatch to the appropriate place.
> + *
> + * The first two command line arguments are expected to be "--forkchild=<name>",
> + * where <name> indicates which postmaster child we are to become, and
> + * the name of a variables file that we can read to load data that would
> + * have been inherited by fork() on Unix.
> + */
> +void
> +SubPostmasterMain(int argc, char *argv[])
> +{
> + PostmasterChildType child_type;
> + char *startup_data;
> + size_t startup_data_len;
> + char *entry_name;
> + bool found = false;
> +
> + /* In EXEC_BACKEND case we will not have inherited these settings */
> + IsPostmasterEnvironment = true;
> + whereToSendOutput = DestNone;
> +
> + /* Setup essential subsystems (to ensure elog() behaves sanely) */
> + InitializeGUCOptions();
> +
> + /* Check we got appropriate args */
> + if (argc != 3)
> + elog(FATAL, "invalid subpostmaster invocation");
> +
> + if (strncmp(argv[1], "--forkchild=", 12) != 0)
> + elog(FATAL, "invalid subpostmaster invocation (--forkchild argument missing)");
> + entry_name = argv[1] + 12;
> + found = false;
> + for (int idx = 0; idx < lengthof(entry_kinds); idx++)
> + {
> + if (strcmp(entry_kinds[idx].name, entry_name) == 0)
> + {
> + child_type = idx;
> + found = true;
> + break;
> + }
> + }
> + if (!found)
> + elog(ERROR, "unknown child kind %s", entry_name);

If we then have to search linearly, why don't we just pass the index into the
array?

>
> -#define StartupDataBase() StartChildProcess(StartupProcess)
> -#define StartArchiver() StartChildProcess(ArchiverProcess)
> -#define StartBackgroundWriter() StartChildProcess(BgWriterProcess)
> -#define StartCheckpointer() StartChildProcess(CheckpointerProcess)
> -#define StartWalWriter() StartChildProcess(WalWriterProcess)
> -#define StartWalReceiver() StartChildProcess(WalReceiverProcess)
> +#define StartupDataBase() StartChildProcess(PMC_STARTUP)
> +#define StartArchiver() StartChildProcess(PMC_ARCHIVER)
> +#define StartBackgroundWriter() StartChildProcess(PMC_BGWRITER)
> +#define StartCheckpointer() StartChildProcess(PMC_CHECKPOINTER)
> +#define StartWalWriter() StartChildProcess(PMC_WAL_WRITER)
> +#define StartWalReceiver() StartChildProcess(PMC_WAL_RECEIVER)
> +
> +#define StartAutoVacLauncher() StartChildProcess(PMC_AV_LAUNCHER);
> +#define StartAutoVacWorker() StartChildProcess(PMC_AV_WORKER);

Obviously not your fault, but these macros are so pointless... Making it
harder to find where we start child processes, all to save a a few characters
in one place, while adding considerably more in others.

> +void
> +BackendMain(char *startup_data, size_t startup_data_len)
> +{

Is there any remaining reason for this to live in postmaster.c? Given that
other backend types don't, that seems oddly assymmetrical.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-11-30 20:31:29 Re: Refactoring backend fork+exec code
Previous Message Robert Haas 2023-11-30 20:21:40 Re: Add missing error codes to PANIC/FATAL error reports in xlogrecovery