Re: Duplicate history file?

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: sfrost(at)snowman(dot)net, tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp, masao(dot)fujii(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Duplicate history file?
Date: 2021-06-15 02:48:44
Message-ID: 20210615024844.lnemu7jytlwh3jcj@nol
Lists: pgsql-hackers

On Tue, Jun 15, 2021 at 10:20:37AM +0900, Kyotaro Horiguchi wrote:
>
> Actually there's a lot of room for losing data with cp. Yes, we would
> need additional redundancy of storage and periodic integrity
> inspection of the storage and archives, and maybe copies at
> other sites on the other side of the Earth. But that is too much for
> some kinds of users. They have the right and responsibility to decide
> how durable/reliable their archive needs to be. (Putting aside some
> hardware/geological requirements :p)

Note that most of those considerations are orthogonal to what a proper
archive_command should be responsible for.

Yes, users are responsible for deciding whether they want a valid and durable
backup, but we should assume a sensible default behavior, which is a valid and
durable archive_command. We don't document a default of fsync = off followed
by a later recommendation explaining why you shouldn't do that, and I think it
should be the same for the archive_command. The problem with the current
documentation is that many users will just blindly copy/paste whatever is in
the documentation without reading any further.

As an example, a few hours ago a French user on the French bulletin board
said that he fixed his "postmaster.pid already exists" error with a
pg_resetxlog -f, referring to some article explaining how to start postgres in
case of "PANIC: could not locate a valid checkpoint record". Arguably
that article didn't bother to document the implications of executing
pg_resetxlog, but given that the user's original problem had literally nothing
to do with what was documented, I really doubt that it would have changed
anything.

> If we mandate some
> characteristics on the archive_command, we should take them into core.

I agree.

> I remember I saw some discussions on archive command along this line but
> I think it had ended at the point of something like "we cannot
> design a one-size-fits-all interface conforming to the requirements" or
> something (sorry, I don't remember the details..)

I also agree, but this problem is solved by making archive_command
customisable. Providing something that can reliably work in some general and
limited cases would be a huge improvement.

> Well. rman used rsync/ssh in its documentation in the past and now
> looks like providing barman-wal-archive so it seems that you're right
> on that point. So, do we recommend them in our documentation? (I'm
> not sure they actually conform to the requirements, though..)

We could maybe bless some third-party backup solutions, but this will probably
lead to a lot more discussions, so it's better to discuss that in a different
thread. Note that I don't have deep knowledge of any of those tools, so I
don't have an opinion.

> If we write an example with a pseudo tool name, requiring some
> characteristics on the tool, then not telling about the minimal tools,
> I think that it is equivalent to that we are inhibiting certain users
> from using archive_command even if they really don't want such level
> of durability.

I have already seen customers complain about losing backups because their
archive_command didn't ensure that the copy was durable. Some users may not
care about losing their backups in such a case, but I really think that the
majority of users expect a backup to be valid, durable and everything without
even thinking that it may not be the case. It should be the default behavior,
not optional.
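
To make "durable" a bit more concrete, here is a minimal sketch of what such
a copy could look like, as opposed to a plain cp. This is only an
illustration under assumptions: archive_durably is a hypothetical helper and
not something PostgreSQL ships, it assumes a POSIX system where a directory
can be opened and fsync'ed, and it simply refuses to overwrite an existing
file rather than comparing contents.

```python
import os
import shutil


def archive_durably(src: str, dest_dir: str) -> None:
    """Copy src into dest_dir and flush it to stable storage.

    Hypothetical helper for illustration only (assumes POSIX).
    Raises an exception on any failure, so a wrapper script would
    exit with a non-zero status.
    """
    final_path = os.path.join(dest_dir, os.path.basename(src))
    tmp_path = final_path + ".tmp"  # assumed temporary naming scheme

    # Never overwrite a file that is already in the archive.
    if os.path.exists(final_path):
        raise FileExistsError(final_path)

    # Copy to a temporary name and force the data to disk.
    with open(src, "rb") as fin, open(tmp_path, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()
        os.fsync(fout.fileno())

    # Publish the file under its final name only once it is complete.
    os.rename(tmp_path, final_path)

    # fsync the directory so that the rename itself survives a crash.
    dir_fd = os.open(dest_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A plain cp does none of the fsync steps, which is how a copy can be reported
as successful and still be lost after a crash of the archive host.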
