Re: backup manifests

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: David Steele <david(at)pgmasters(dot)net>, Noah Misch <noah(at)leadboat(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: backup manifests
Date: 2020-03-31 22:50:34
Message-ID: 20200331225034.odt3l2w4h3dr6ggk@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-03-31 14:10:34 -0400, Robert Haas wrote:
> I made an attempt to implement this.

Awesome!

> In the attached patch set, 0001 I'm going to work on those things. I
> would appreciate *very timely* feedback on anything people do or do
> not like about this, because I want to commit this patch set by the
> end of the work week and that isn't very far away. I would also
> appreciate if people would bear in mind the principle that half a loaf
> is better than none, and further improvements can be made in future
> releases.
>
> As part of my light testing, I tried promoting a standby that was
> running pg_basebackup, and found that pg_basebackup failed like this:
>
> pg_basebackup: error: could not get COPY data stream: ERROR: the
> standby was promoted during online backup
> HINT: This means that the backup being taken is corrupt and should
> not be used. Try taking another online backup.
> pg_basebackup: removing data directory "/Users/rhaas/pgslave2"
>
> My first thought was that this error message is hard to reconcile with
> this comment:
>
> /*
> * Send timeline history files too. Only the latest timeline history
> * file is required for recovery, and even that only if there happens
> * to be a timeline switch in the first WAL segment that contains the
> * checkpoint record, or if we're taking a base backup from a standby
> * server and the target timeline changes while the backup is taken.
> * But they are small and highly useful for debugging purposes, so
> * better include them all, always.
> */
>
> But then it occurred to me that this might be a cascading standby.

Yea. The check just prevents the walsender's database from being
promoted:

/*
* Check if the postmaster has signaled us to exit, and abort with an
* error in that case. The error handler further up will call
* do_pg_abort_backup() for us. Also check that if the backup was
* started while still in recovery, the server wasn't promoted.
* do_pg_stop_backup() will check that too, but it's better to stop
* the backup early than continue to the end and fail there.
*/
CHECK_FOR_INTERRUPTS();
if (RecoveryInProgress() != backup_started_in_recovery)
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("the standby was promoted during online backup"),
errhint("This means that the backup being taken is corrupt "
"and should not be used. "
"Try taking another online backup.")));
and

if (strcmp(backupfrom, "standby") == 0 && !backup_started_in_recovery)
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("the standby was promoted during online backup"),
errhint("This means that the backup being taken is corrupt "
"and should not be used. "
"Try taking another online backup.")));

So that just prevents promotions of the current node, afaict.

> Regardless, it seems like a really good idea to store a list of WAL
> ranges rather than a single start/end/timeline, because even if it's
> impossible today it might become possible in the future.

Indeed.

> Still, unless there's an easy way to set up a test scenario where
> multiple WAL ranges need to be verified, it may be hard to test that
> this code actually behaves properly.

I think it'd be possible to test without a fully cascading setup, by
creating an initial base backup, then do some work to create a bunch of
new timelines, and then start the initial base backup. That'd have to
follow all those timelines. Not sure that's better than a cascading
setup though.

> +/*
> + * Add information about the WAL that will need to be replayed when restoring
> + * this backup to the manifest.
> + */
> +static void
> +AddWALInfoToManifest(manifest_info *manifest, XLogRecPtr startptr,
> + TimeLineID starttli, XLogRecPtr endptr, TimeLineID endtli)
> +{
> + List *timelines = readTimeLineHistory(endtli);

should probably happen after the manifest->buffile check.

> + ListCell *lc;
> + bool first_wal_range = true;
> + bool found_ending_tli = false;
> +
> + /* If there is no buffile, then the user doesn't want a manifest. */
> + if (manifest->buffile == NULL)
> + return;

Not really about this patch/function specifically: I wonder if this'd
look better if you added ManifestEnabled() macro instead of repeating
the comment repeatedly.

> + /* Unless --no-parse-wal was specified, we will need pg_waldump. */
> + if (!no_parse_wal)
> + {
> + int ret;
> +
> + pg_waldump_path = pg_malloc(MAXPGPATH);
> + ret = find_other_exec(argv[0], "pg_waldump",
> + "pg_waldump (PostgreSQL) " PG_VERSION "\n",
> + pg_waldump_path);
> + if (ret < 0)
> + {
> + char full_path[MAXPGPATH];
> +
> + if (find_my_exec(argv[0], full_path) < 0)
> + strlcpy(full_path, progname, sizeof(full_path));
> + if (ret == -1)
> + pg_log_fatal("The program \"%s\" is needed by %s but was\n"
> + "not found in the same directory as \"%s\".\n"
> + "Check your installation.",
> + "pg_waldump", "pg_validatebackup", full_path);
> + else
> + pg_log_fatal("The program \"%s\" was found by \"%s\" but was\n"
> + "not the same version as %s.\n"
> + "Check your installation.",
> + "pg_waldump", full_path, "pg_validatebackup");
> + }
> + }

ISTM, and this can definitely wait for another time, that we should have
one wrapper doing all of this, instead of having quite a few copies of
very similar logic to the above.

> +/*
> + * Attempt to parse the WAL files required to restore from backup using
> + * pg_waldump.
> + */
> +static void
> +parse_required_wal(validator_context *context, char *pg_waldump_path,
> + char *wal_directory, manifest_wal_range *first_wal_range)
> +{
> + manifest_wal_range *this_wal_range = first_wal_range;
> +
> + while (this_wal_range != NULL)
> + {
> + char *pg_waldump_cmd;
> +
> + pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%X --end=%X/%X\n",
> + pg_waldump_path, wal_directory, this_wal_range->tli,
> + (uint32) (this_wal_range->start_lsn >> 32),
> + (uint32) this_wal_range->start_lsn,
> + (uint32) (this_wal_range->end_lsn >> 32),
> + (uint32) this_wal_range->end_lsn);
> + if (system(pg_waldump_cmd) != 0)
> + report_backup_error(context,
> + "WAL parsing failed for timeline %u",
> + this_wal_range->tli);
> +
> + this_wal_range = this_wal_range->next;
> + }
> +}

Should we have a function to properly escape paths in cases like this?
Not that it's likely or really problematic, but the quoting for path
could be "circumvented".

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2020-03-31 22:53:57 Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Previous Message Tomas Vondra 2020-03-31 22:45:37 Re: [PATCH] Incremental sort (was: PoC: Partial sort)