Re: Trap errors from streaming child in pg_basebackup to exit early

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Trap errors from streaming child in pg_basebackup to exit early
Date: 2021-10-26 11:25:06
Message-ID: CAD21AoDCVrgdTU1QKgaAEk+8khFq=r4Sd6x79VZ3YqVYAHiaPA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 29, 2021 at 8:18 PM Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
>
> > On 28 Sep 2021, at 15:48, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>
> > Might be worth noting in one of the comments the difference in
> > behaviour if the backend process/thread *crashes* -- that is, on Unix
> > it will be detected via the signal handler and on Windows the whole
> > process including the main thread will die, so there is nothing to
> > detect.
>
> Good point, done.
>
> > Other places in the code just refers to the background process as "the
> > background process". The term "WAL receiver" is new from this patch.
> > While I agree it's in many ways a better term, I think (1) we should
> > try to be consistent, and (2) maybe use a different term than "WAL
> > receiver" specifically because we have a backend component with that
> > name.
>
> Looking at the user-facing messaging we have before this patch, there is a bit
> of variability:
>
> On UNIX:
>
> pg_log_error("could not create pipe for background process: %m");
> pg_log_error("could not create background process: %m");
> pg_log_info("could not send command to background pipe: %m");
> pg_log_error("could not wait for child process: %m");
>
> On Windows:
>
> pg_log_error("could not create background thread: %m");
> pg_log_error("could not get child thread exit status: %m");
> pg_log_error("could not wait for child thread: %m");
> pg_log_error("child thread exited with error %u", ..);
>
> On Both:
>
> pg_log_info("starting background WAL receiver");
> pg_log_info("waiting for background process to finish streaming ...");
>
> So there is one mention of a background WAL receiver already in there, but it's
> pretty inconsistent as to what we call it. For now I've changed the messaging
> in this patch to say "background process", leaving making this all consistent
> for a follow-up patch.
>
> The attached fixes the above, as well as the typo mentioned off-list and is
> rebased on top of todays HEAD.

Thank you for working on this issue.

The patch looks good to me but there is one minor comment:

--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -174,6 +174,8 @@ static int bgpipe[2] = {-1, -1};
/* Handle to child process */
static pid_t bgchild = -1;
static bool in_log_streamer = false;
+/* Flag to indicate if child process exited unexpectedly */
+static volatile sig_atomic_t bgchild_exited = false;

It's better to have a new line before the new comment.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Rafia Sabih 2021-10-26 11:46:48 Re: Add connection active, idle time to pg_stat_activity
Previous Message Amul Sul 2021-10-26 10:59:55 Re: [Patch] ALTER SYSTEM READ ONLY