Re: Duplicate history file?

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, masao(dot)fujii(at)oss(dot)nttdata(dot)com, michael(at)paquier(dot)xyz, pgsql-hackers(at)lists(dot)postgresql(dot)org, tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp
Subject: Re: Duplicate history file?
Date: 2021-06-16 05:38:30
Message-ID: 20210616053830.jicyv77pwkfxaozk@nol
Lists: pgsql-hackers

On Wed, Jun 16, 2021 at 01:17:11AM -0400, Stephen Frost wrote:
>
> The archive command is technically invoked using the shell, but the
> interpretation of the exit code, for example, is only discussed in the C
> code, but it’s far from the only consideration that someone developing an
> archive command needs to understand.

The expectations for the return code are documented. There are some subtleties
when the command is interrupted by a signal, and those are documented too.
Why shouldn't we document the rest of the expectations for what such a command
should do?
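
To illustrate the return-code part of that contract, here is a minimal sketch,
assuming a POSIX system and an archive that is just a local directory. The
function name archive_wal and the ".tmp" suffix are hypothetical, chosen for
illustration only; the sketch follows the documentation's advice to return
zero only on success and never to overwrite an existing file, and it does not
claim to cover every requirement discussed in this thread.

```python
import os
import shutil


def archive_wal(src_path, archive_dir):
    """Copy one WAL file into archive_dir and return a shell-style exit code.

    Returns 0 only when the file is durably stored under its final name;
    any failure, including "already archived", returns 1.
    """
    dest = os.path.join(archive_dir, os.path.basename(src_path))
    tmp = dest + ".tmp"  # hypothetical temporary name
    try:
        # Write under a temporary name first so that a crash never leaves
        # a partial file under the final name.
        with open(src_path, "rb") as fin, open(tmp, "wb") as fout:
            shutil.copyfileobj(fin, fout)
            fout.flush()
            os.fsync(fout.fileno())
        # os.link raises FileExistsError if dest already exists, so an
        # already-archived file is never overwritten.
        os.link(tmp, dest)
        # Flush the directory entry as well (works on POSIX systems).
        dir_fd = os.open(archive_dir, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
    except OSError:
        return 1
    finally:
        if os.path.exists(tmp):
            os.remove(tmp)
    return 0
```

A wrapper script used as archive_command would pass the function's return
value to sys.exit(), so that the server only treats the file as archived when
the result is zero.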

> The notion that an archive command can be distanced from backups is really
> not reasonable in my opinion.

As far as I know, you can use archiving for replication purposes only. In that
case you will certainly use the archived files differently than you would for
backups, and verify them differently. But the requirements for what makes an
archive_command safe won't change.

> > > Consider that, really, an archive command should refuse to allow archiving
> > > of WAL on a timeline which doesn’t have a corresponding history file in the
> > > archive for that timeline (excluding timeline 1).
> >
> > Yes, that's a clear requirement that should be documented.
>
>
> Is it a clear requirement that pgbackrest and every other organization that
> has developed an archive command has missed? Are you able to point to a
> tool which has such a check today?

I don't know, as I don't have any knowledge of what barman, BART, pgbackrest,
pg_probackup or any other backup solution does in any detail. I was only saying
that what you said makes sense and should be part of the documentation,
assuming that this is indeed a requirement.
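
If this is indeed a requirement, the check could be sketched roughly as
follows. This is only an illustration under stated assumptions: the archive is
a local directory, and the helper name may_archive is hypothetical. It relies
on the standard naming scheme, where the first 8 hexadecimal digits of a
24-character WAL segment name are the timeline ID and the matching history
file is named after that same ID. Files that are not plain segments (history
files, backup history files, partial segments) are not checked here.

```python
import os
import re

# A plain WAL segment name is 24 upper-case hexadecimal digits.
WAL_SEGMENT_RE = re.compile(r"^[0-9A-F]{24}$")


def may_archive(file_name, archive_dir):
    """Return True if file_name may be archived into archive_dir.

    A WAL segment on a timeline other than 1 is accepted only when the
    history file for that timeline is already present in the archive.
    """
    if not WAL_SEGMENT_RE.match(file_name):
        # Not a plain segment: this sketch applies no check.
        return True
    timeline = int(file_name[:8], 16)
    if timeline == 1:
        # Timeline 1 never has a history file.
        return True
    history = os.path.join(archive_dir, "%08X.history" % timeline)
    return os.path.exists(history)
```

An archive command built on this would return a nonzero status whenever
may_archive() is False, so the server keeps the segment and retries later.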

> > > Also, a backup tool should compare the result of pg_start_backup to what’s
> > > in the control file, using a fresh read, after start backup returns to make
> > > sure that the storage is sane and not, say, cache’ing pages independently
> > > (such as might happen with a separate NFS mount..). Oh, and if a replica is
> > > involved, a check should be done to see if the replica has changed timelines
> > > and an appropriate message thrown if that happens complaining that the
> > > backup was aborted due to the promotion of the replica…
> >
> > I agree, but unless I'm missing something it's unrelated to what an
> > archive_command should be in charge of?
>
> I’m certainly not moved by this argument as it seems to be willfully
> missing the point. Further, if we are going to claim that we must document
> archive command to such level then surely we need to also document all the
> requirements for pg_start_backup and pg_stop_backup too, so this strikes me
> as entirely relevant.

So what was the point? I'm not saying that doing backups is trivial and/or
should not be properly documented, nor that we shouldn't improve the
pg_start_backup or pg_stop_backup documentation. I'm just saying that those
don't change what makes an archive_command safe.
