Re: Detecting some cases of missing backup_label

From: David Steele <david(at)pgmasters(dot)net>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject: Re: Detecting some cases of missing backup_label
Date: 2023-12-21 12:26:29
Message-ID: 593dc2c5-4749-4925-b1e4-e952fe659752@pgmasters.net
Lists: pgsql-hackers

On 12/21/23 07:37, Andres Freund wrote:
>
> On 2023-12-20 13:11:37 -0400, David Steele wrote:
>> I've run this through a bunch of scenarios (in my head) with parallel
>> backups and it does seem to hold up.
>>
>> I think we'd need to write the state file before XLOG_BACKUP_START just in
>> case. Seems better to have an extra state file rather than have one be
>> missing.
>
> That'd very significantly weaken the approach, afaict, because an "external"
> base backup could end up copying those files. The whole point is to detect
> broken procedures, so relying on such files being excluded from the base
> backup seems like a bad idea.
>
> I also see no need to do so - because we'd only verify that a backup start has
> been replayed when replaying XLOG_BACKUP_STOP there's no danger in not
> creating the files during XLOG_BACKUP_START, but doing so just before logging
> the XLOG_BACKUP_STOP.

Ugh, I meant XLOG_BACKUP_STOP. So sounds like we are on the same page.
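
To make the check being discussed concrete, here is a toy Python model of
it. This is only a sketch, not PostgreSQL code: XLOG_BACKUP_START and
XLOG_BACKUP_STOP are the record names proposed in this thread, and
ReplayState, MissingBackupStart and replay_from are hypothetical names
invented for the illustration. It assumes each backup carries an id that
appears in both records, and that a missing backup_label shows up as replay
beginning somewhere after the start record.

```python
from dataclasses import dataclass, field

# Record names as proposed in the thread; they are not an existing API.
BACKUP_START = "XLOG_BACKUP_START"
BACKUP_STOP = "XLOG_BACKUP_STOP"


class MissingBackupStart(Exception):
    """Raised when a stop record is replayed without its start record."""


@dataclass
class ReplayState:
    # Ids of backups whose start record has been replayed so far.
    started: set = field(default_factory=set)

    def replay(self, kind, backup_id):
        if kind == BACKUP_START:
            self.started.add(backup_id)
        elif kind == BACKUP_STOP:
            # The verification happens only here, at stop-record replay.
            if backup_id not in self.started:
                raise MissingBackupStart(
                    f"stop replayed for backup {backup_id!r} "
                    "but its start was never replayed"
                )
            self.started.discard(backup_id)
        # Any other record kind is irrelevant to this check.


def replay_from(wal, start_index):
    """Replay a list of (kind, backup_id) records from start_index onward."""
    state = ReplayState()
    for kind, backup_id in wal[start_index:]:
        state.replay(kind, backup_id)
    return state
```

With a stream such as [(BACKUP_START, "a"), ("OTHER", None),
(BACKUP_STOP, "a")], replaying from index 0 succeeds, while replaying from
index 1 raises MissingBackupStart, which is the broken-procedure case the
safety net is meant to catch.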

>> Probably we'd want to exclude *all* state files from backups, though.
>
> I don't think so - I think we want the opposite? As noted above, I think in a
> safety net like this we shouldn't assume that backup procedures were followed
> correctly.

Fair enough.

>> Seems like in various PITR scenarios it could be hard to determine when to
>> remove them.
>
> Why? I think we can basically remove the files when:
>
> a) after the checkpoint during which XLOG_BACKUP_STOP was replayed - I think
> we already have the infrastructure to queue file deletions that we can hook
> into
> b) when replaying a shutdown checkpoint / after creation of a shutdown
> checkpoint

I thought about this some more. I *think* any state files a backup can
see would have to be for XLOG_BACKUP_STOP records generated during the
backup and they would get removed before the cluster had recovered to
consistency.

I'd still prefer to exclude state files from the backup, but I agree
there is no actual need to do so.
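
The two removal rules quoted above, (a) and (b), can be sketched as a small
Python model. This is one possible reading under stated assumptions, not an
implementation: StateFiles and its method names are hypothetical, state
files are represented only by backup ids, and "queue file deletions" is
modelled as a pending set that the next checkpoint flushes.

```python
class StateFiles:
    """Toy model of state-file cleanup during recovery (hypothetical names)."""

    def __init__(self, existing=()):
        self.files = set(existing)   # state files currently on disk
        self.pending = set()         # deletions queued for the next checkpoint

    def on_backup_stop_replayed(self, backup_id):
        # Rule (a), first half: replaying the stop record only queues the
        # deletion; the file stays until the checkpoint completes.
        if backup_id in self.files:
            self.pending.add(backup_id)

    def on_checkpoint(self):
        # Rule (a), second half: the checkpoint during which the stop record
        # was replayed removes the queued files.
        self.files -= self.pending
        self.pending.clear()

    def on_shutdown_checkpoint(self):
        # Rule (b): no backup can span a shutdown, so every remaining state
        # file is removed.
        self.files.clear()
        self.pending.clear()
```

In this model a state file copied into a backup survives only until its
stop record has been replayed and the following checkpoint has run, or
until a shutdown checkpoint is seen.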

Regards,
-David
