Re: Trap errors from streaming child in pg_basebackup to exit early

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: Trap errors from streaming child in pg_basebackup to exit early
Date: 2021-09-03 15:03:45
Message-ID: CALj2ACVkc8rQPMaRMHQvk4ej=J8PW=iF3Zb=d_m8y2n2xPTyFA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 3, 2021 at 3:23 PM Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
> > 4) Instead of just exiting from the main pg_basebackup process when
> > the child WAL receiver dies, can't we think of restarting the child
> > process, probably with the WAL streaming position where it left off or
> > stream from the beginning? This way, the work that the main
> > pg_basebackup has done so far doesn't get wasted. I'm not sure if this
> > affects the pg_basebackup functionality. We can restart the child
> > process once or twice; if it still dies, we can kill the main
> > pg_basebackup process too. Thoughts?
>
> I was toying with the idea, but I ended up not pursuing it. This error is well
> into the “really shouldn’t happen, but can” territory and it’s quite likely
> that some level of manual intervention is required to make it successfully
> restart. I’m not convinced that adding complicated logic to restart (and even
> more complicated tests to simulate and test it) will be worthwhile.

I withdraw my suggestion, because I now feel that it's better not to
add that complexity and to let the user decide what to do if the child
process ever exits abnormally.

I think we might still miss abnormal child thread exits on Windows,
because we set bgchild_exited = true only if ReceiveXlogStream or
walmethod->finish() returns false. I'm not sure the parent thread on
Windows can otherwise detect a child's abnormal exit. Since there is no
signal mechanism on Windows, detecting child exit when either of these
two important functions fails, as the patch does, is the better option.
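
To make the concern concrete, here is a minimal, self-contained sketch of
the flag logic as I understand it. It is not the patch's code: the enum,
stream_stub(), finish_stub(), child_body() and parent_sees_child_failure()
are hypothetical stand-ins for illustration, the flag's type is assumed,
and the child body is called directly rather than in a separate thread to
keep the example short.

```c
#include <signal.h>
#include <stdbool.h>

/* Flag polled by the parent; the name follows the discussion above,
 * the type is an assumption made for this sketch. */
static volatile sig_atomic_t bgchild_exited = false;

/* Hypothetical outcomes for the stand-in child. */
typedef enum
{
    CHILD_OK,             /* both calls succeed */
    CHILD_STREAM_FAILS,   /* stand-in for ReceiveXlogStream returning false */
    CHILD_FINISH_FAILS,   /* stand-in for walmethod->finish() returning false */
    CHILD_DIES_SILENTLY   /* child ends without either call reporting failure */
} ChildOutcome;

/* Stubs that only model the boolean result of the two real functions. */
static bool stream_stub(ChildOutcome o) { return o != CHILD_STREAM_FAILS; }
static bool finish_stub(ChildOutcome o) { return o != CHILD_FINISH_FAILS; }

/* Body of the child: the flag is set only on the two failure paths. */
static int
child_body(ChildOutcome outcome)
{
    if (outcome == CHILD_DIES_SILENTLY)
        return 1;               /* models an abnormal exit: flag never set */

    if (!stream_stub(outcome))
    {
        bgchild_exited = true;
        return 1;
    }
    if (!finish_stub(outcome))
    {
        bgchild_exited = true;
        return 1;
    }
    return 0;
}

/* What the parent can learn by looking at the flag alone. */
static bool
parent_sees_child_failure(ChildOutcome outcome)
{
    bgchild_exited = false;
    (void) child_body(outcome);
    return bgchild_exited != 0;
}
```

In this sketch the parent notices CHILD_STREAM_FAILS and CHILD_FINISH_FAILS,
but CHILD_DIES_SILENTLY looks the same as CHILD_OK when only the flag is
consulted, which is the gap described above.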

Overall, the v3 patch looks good to me.

Regards,
Bharath Rupireddy.
