Re: Move backup-related code to xlogbackup.c/.h

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Move backup-related code to xlogbackup.c/.h
Date: 2022-10-14 08:24:41
Message-ID: 20221014082441.adpqpgr75nt7yxs3@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2022-Oct-13, Bharath Rupireddy wrote:

> The pg_backup_start_callback() can just go ahead and reset
> sessionBackupState. However, it leads us to the complete removal of
> pg_backup_start_callback() itself and use do_pg_abort_backup()
> consistently across, saving 20 LOC attached as v5-0001.

OK, that's not bad -- but there is a fatal flaw here: do_pg_backup_start
only sets sessionBackupState *after* it has finished setting things up,
so if you only change it like this, do_pg_abort_backup will indeed run,
but it'll do nothing because it hits the "quick exit" test. Therefore,
if a backup aborts while setting up, you'll keep running with forced
page writes until next postmaster crash or restart. Not good.

ISTM we need to give another flag to the callback function besides
emit_warning: one that says whether to test sessionBackupState or not.
I suppose the easiest way to do it with no other changes is to turn
'arg' into a bitmask.
But alternatively, we could just remove emit_warning as a flag and have
the warning be emitted always; then we can use the boolean for the other
purpose. I don't think the extra WARNING thrown during backup set-up is
going to be a problem, since it will mostly never be seen anyway (and if
you do see it, it's not a lie.)

However, what's most problematic about this patch is that it introduces
a pretty serious bug, yet that bug goes unnoticed if you just run the
builtin test suites. I only noticed because I added an elog(ERROR,
"oops") in the area protected by ENSURE_ERROR_CLEANUP and a debug
elog(WARNING) in the cleanup area, then examined the server log after
the pg_basebackup test filed; but this is not very workable. I wonder
what would be a good way to keep this in check. The naive way seems to
be to run a pg_basebackup, have it abort partway through (how?), then
test the server and see if forced page writes are enabled or not.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"The problem with the facetime model is not just that it's demoralizing, but
that the people pretending to work interrupt the ones actually working."
(Paul Graham)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-10-14 08:33:14 Re: Move backup-related code to xlogbackup.c/.h
Previous Message Andy Fan 2022-10-14 08:07:21 Re: Question about pull_up_sublinks_qual_recurse