Re: Trap errors from streaming child in pg_basebackup to exit early

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: Trap errors from streaming child in pg_basebackup to exit early
Date: 2021-09-01 10:28:44
Message-ID: CALj2ACWZdKuohWDrcwL5AfSSZOHKnnhuFCTOe2N17jZuZpR0zg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 1, 2021 at 1:56 PM Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
> A v2 with the above fixes is attached.

Thanks for the updated patch. Here are some comments:

1) Do we need to set bgchild = -1 before the exit(1); in the code
below so that we don't kill(bgchild, SIGTERM); unnecessarily in
kill_bgchild_atexit?
+ if (bgchild_exited)
+ {
+ pg_log_error("background WAL receiver terminated unexpectedly");
+ exit(1);
+ }
+

2) Missing "," after "On Windows, we use a ....."
+ * that time. On Windows we use a background thread which can communicate

3) How about "/* Flag to indicate whether or not child process exited
*/" instead of +/* State of child process */?

4) Instead of just exiting from the main pg_basebackup process when
the child WAL receiver dies, can't we think of restarting the child
process, probably with the WAL streaming position where it left off or
stream from the beginning? This way, the work that the main
pg_basebackup has done so far doesn't get wasted. I'm not sure if this
affects the pg_basebackup functionality. We can restart the child
process for 1 or 2 times, if it still dies, we can kill the main
pg_baasebackup process too. Thoughts?

Regards,
Bharath Rupireddy.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tony Reix 2021-09-01 10:29:32 Re: AIX: Symbols are missing in libpq.a
Previous Message kuroda.hayato@fujitsu.com 2021-09-01 10:04:21 RE: Allow escape in application_name (was: [postgres_fdw] add local pid to fallback_application_name)