Re: Refactoring backend fork+exec code

From: "Tristan Partin" <tristan(at)neon(dot)tech>
To: "Heikki Linnakangas" <hlinnaka(at)iki(dot)fi>, "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Refactoring backend fork+exec code
Date: 2023-07-10 21:07:24
Message-ID: CTYSRELGE6HH.FY67XNO5HXSH@gonk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> From 1d89eec53c7fefa7a4a8c011c9f19e3df64dc436 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
> Date: Mon, 12 Jun 2023 16:33:20 +0300
> Subject: [PATCH 4/9] Use FD_CLOEXEC on ListenSockets

> @@ -831,7 +834,8 @@ StreamConnection(pgsocket server_fd, Port *port)
> void
> StreamClose(pgsocket sock)
> {
> - closesocket(sock);
> + if (closesocket(sock) != 0)
> + elog(LOG, "closesocket failed: %m");
> }
>
> /*

Do you think WARNING would be a more appropriate log level?

> From 2f518be9e96cfed1a1a49b4af8f7cb4a837aa784 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
> Date: Mon, 12 Jun 2023 18:07:54 +0300
> Subject: [PATCH 5/9] Move "too many clients already" et al. checks from
> ProcessStartupPacket.

This seems like a change you could push already (assuming another
maintainer agrees with you), which makes reviews for this patchset even
easier.

> From c25b67c045018a2bf05e6ff53819d26e561fc83f Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
> Date: Sun, 18 Jun 2023 14:11:16 +0300
> Subject: [PATCH 6/9] Pass CAC as argument to backend process.

Could you expand a bit more on why it is better to pass it as a separate
argument? Does it not fit well conceptually in struct Port?

> @@ -4498,15 +4510,19 @@ postmaster_forkexec(int argc, char *argv[])
> * returns the pid of the fork/exec'd process, or -1 on failure
> */
> static pid_t
> -backend_forkexec(Port *port)
> +backend_forkexec(Port *port, CAC_state cac)
> {
> - char *av[4];
> + char *av[5];
> int ac = 0;
> + char cacbuf[10];
>
> av[ac++] = "postgres";
> av[ac++] = "--forkbackend";
> av[ac++] = NULL; /* filled in by internal_forkexec */
>
> + snprintf(cacbuf, sizeof(cacbuf), "%d", (int) cac);
> + av[ac++] = cacbuf;

Might be worth a sanity check that there wasn't any truncation into
cacbuf, which is an impossibility as the code is written, but still
useful for catching a future developer error.

Is it worth adding a command line option at all instead of having the
naked positional argument? It would help anybody who might read the
command line what the seemingly random integer stands for.

> @@ -4910,7 +4926,10 @@ SubPostmasterMain(int argc, char *argv[])
> /* Run backend or appropriate child */
> if (strcmp(argv[1], "--forkbackend") == 0)
> {
> - Assert(argc == 3); /* shouldn't be any more args */
> + CAC_state cac;
> +
> + Assert(argc == 4);
> + cac = (CAC_state) atoi(argv[3]);

Perhaps an assert or full error checking that atoi succeeds would be
useful for similar reasons to my previous comment.

> From 658cba5cdb2e5c45faff84566906d2fcaa8a3674 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
> Date: Mon, 12 Jun 2023 18:03:03 +0300
> Subject: [PATCH 7/9] Remove ConnCreate and ConnFree, and allocate Port in
> stack.

Again, seems like another patch that could be pushed separately assuming
others don't have any comments.

> From 65384b9a6cfb3b9b589041526216e0f64d64bea5 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
> Date: Sun, 18 Jun 2023 13:56:44 +0300
> Subject: [PATCH 8/9] Introduce ClientSocket, rename some funcs

> @@ -1499,7 +1499,7 @@ CloseServerPorts(int status, Datum arg)
> {
> if (ListenSocket[i] != PGINVALID_SOCKET)
> {
> - StreamClose(ListenSocket[i]);
> + closesocket(ListenSocket[i]);
> ListenSocket[i] = PGINVALID_SOCKET;
> }
> }

I see you have been adding log messages in the case of closesocket()
failing. Do you think it is worth doing here as well?

One strange part about this patch is that in patch 4, you edit
StreamClose() to emit a log message in the case of closesocket()
failure, but then this patch just completely removes it.

> @@ -4407,11 +4420,11 @@ BackendInitialize(Port *port, CAC_state cac)
> * Doesn't return at all.
> */
> static void
> -BackendRun(Port *port)
> +BackendRun(void)
> {
> /*
> - * Create a per-backend PGPROC struct in shared memory. We must do
> - * this before we can use LWLocks (in AttachSharedMemoryAndSemaphores).
> + * Create a per-backend PGPROC struct in shared memory. We must do this
> + * before we can use LWLocks (in AttachSharedMemoryAndSemaphores).
> */
> InitProcess();

This comment reflow probably fits better in the patch that added the
AttachSharedMemoryAndSemaphores function.

> From b33cfeb28a5419045acb659a01410b2b463bea3e Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
> Date: Sun, 18 Jun 2023 13:59:48 +0300
> Subject: [PATCH 9/9] Refactor postmaster child process launching

> - Move code related to launching backend processes to new source file,
> process_start.c

Since this seems pretty self-contained, my be easier to review if this
was its own commit.

> - Refactor the mechanism of passing informaton from the parent to
> child process. Instead of using different command-line arguments
> when launching the child process in EXEC_BACKEND mode, pass a
> variable-length blob of data along with all the global
> variables. The contents of that blob depends on the kind of child
> process being launched. In !EXEC_BACKEND mode, we use the same blob,
> but it's simply inherited from the parent to child process.

Same with this. Perhaps others would disagree.

> +const PMChildEntry entry_kinds[] = {
> + {"backend", BackendMain, true},
> +
> + {"autovacuum launcher", AutoVacLauncherMain, true},
> + {"autovacuum worker", AutoVacWorkerMain, true},
> + {"bgworker", BackgroundWorkerMain, true},
> + {"syslogger", SysLoggerMain, false},
> +
> + {"startup", StartupProcessMain, true},
> + {"bgwriter", BackgroundWriterMain, true},
> + {"archiver", PgArchiverMain, true},
> + {"checkpointer", CheckpointerMain, true},
> + {"wal_writer", WalWriterMain, true},
> + {"wal_receiver", WalReceiverMain, true},
> +};

It seems like this could be made static. I didn't see it getting exposed
in a header file anywhere, but I also admit that I can be blind at
times.

I need to spend more time looking at this last patch.

Nice work so far!

--
Tristan Partin
Neon (https://neon.tech)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-07-10 21:36:43 Re: Allowing parallel-safe initplans
Previous Message Michael Banck 2023-07-10 21:02:54 Re: pg_upgrade instructions involving "rsync --size-only" might lead to standby corruption?