Re: Duplicate history file?

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: sfrost(at)snowman(dot)net
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp
Subject: Re: Duplicate history file?
Date: 2021-06-08 04:17:53
Message-ID: 20210608.131753.634245279577988155.horikyota.ntt@gmail.com
Lists: pgsql-hackers

Yeah, it's hot these days...

At Tue, 08 Jun 2021 12:04:43 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> (Mmm. thunderbird or gmail connects this thread to the previous one..)
>
> At Mon, 7 Jun 2021 14:20:38 -0400, Stephen Frost <sfrost(at)snowman(dot)net> wrote in
> > Greetings,
> >
> > * Kyotaro Horiguchi (horikyota(dot)ntt(at)gmail(dot)com) wrote:
> > > So, this is the new new thread.
> >
> > This is definitely not the way I would recommend starting up a new
> > thread as you didn't include the actual text of the prior discussion for
> > people to be able to read and respond to, instead making them go hunt
> > for the prior discussion on the old thread and negating the point of
> > starting a new thread..
>
> Sorry for that. I'll do that next time.
>
> > Still, I went and found the other email-
>
> Thanks!
>
> > * Kyotaro Horiguchi (horikyota(dot)ntt(at)gmail(dot)com) wrote:
> > > At Mon, 31 May 2021 11:52:05 +0900, Tatsuro Yamada <tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp> wrote in
> > > > Since the above behavior is different from the behavior of the
> > > > test command in the following example in postgresql.conf, I think
> > > > we should write a note about this example.
> > > >
> > > > # e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p
> > > > # /mnt/server/archivedir/%f'
> > > >
> > > > Let me describe the problem we faced.
> > > > - When archive_mode=always, archive_command is (sometimes) executed
> > > > in a situation where the history file already exists on the standby
> > > > side.
> > > >
> > > > - In this case, if "test ! -f" is written in the archive_command of
> > > > postgresql.conf on the standby side, the command will keep failing.
> > > >
> > > > Note that this problem does not occur when archive_mode=on.
> > > >
> > > > So, what should we do for the user? I think we should put some notes
> > > > in postgresql.conf or in the documentation. For example, something
> > > > like this:
> >
> > First off, we should tell them to not use test or cp in their actual
> > archive command because they don't do things like make sure that the WAL
> > that's been archived has actually been fsync'd. Multiple people have
> > tried to make improvements in this area but the long and short of it is
> > that trying to provide a simple archive command in the documentation
> > that actually *works* isn't enough- you need a real tool. Maybe someone
> > will write one some day that's part of core, but it's not happened yet
> > and instead there's external solutions which actually do the correct
> > things.
>
> Ideally, I agree that that is definitely right. But the documentation
> says nothing like "don't use the simple copy command in any case
> (or at least in cases where more than a certain level of durability
> and integrity guarantee is required)".
>
> Actually, many people are satisfied with just "cp/copy", and I think
> they know that the command doesn't guarantee the integrity of
> archived files in, say, a disastrous situation like a sudden power
> cut.
>
> However, the use of "test ! -f..." is a somewhat different kind of
> suggestion.
>
> https://www.postgresql.org/docs/13/continuous-archiving.html
> | The archive command should generally be designed to refuse to
> | overwrite any pre-existing archive file. This is an important safety
> | feature to preserve the integrity of your archive in case of
> | administrator error (such as sending the output of two different
> | servers to the same archive directory)
>
> This implies that no WAL segment is archived more than once, at least
> under any valid operation. Some people follow this suggestion
> to prevent the archive from being broken by some *wrong* operation.
>
> > The existing documentation should be taken as purely "this is how the
> > variables which are passed in get expanded" not as "this is what you
> > should do", because it's very much not the latter in any form.
>

- It describes "how archive_command should be like" and showing examples
+ It describes "what archive_command should be like" and showing examples

> among the description implies that the example conforms to the
> should-be's.
>
> Nevertheless, the issue here is that there's a case where archiving
> stalls when following the suggestion above under a certain condition.
> Even if it is written on the premise of "set .. archive_mode to on", I don't
> believe that people can surmise that the same archive_command might
- fail when setting archive_mode to always, because the description
- implies
+ fail when setting archive_mode to always.

>
> So I think we need to revise the documentation, or else *fix* the
> revealed problem that breaks the assumption of the documentation.
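
One way to keep the "refuse to overwrite" safety check without stalling
when the same file is archived twice is to treat an already-archived file
with identical content as success. The following is only a minimal sketch
under assumptions, not something proposed in this thread: archive_wal is a
hypothetical helper name, it relies on the standard cmp, cp and mv
utilities, and it does nothing about fsync, so it does not address the
durability concern raised above. It is also not safe against two writers
racing on the same destination.

```shell
# Hypothetical helper: archive_wal SRC DEST
# Returns 0 if DEST was written or already holds identical content,
# non-zero if DEST exists with different content or the copy fails.
archive_wal() {
    src=$1
    dest=$2

    if [ -f "$dest" ]; then
        # Same file offered again (e.g. a history file that already
        # exists in the archive): identical content counts as success.
        if cmp -s "$src" "$dest"; then
            return 0
        fi
        # Different content under the same name: refuse to overwrite.
        echo "archive_wal: $dest exists with different content" >&2
        return 1
    fi

    # Copy under a temporary name first so that a partial file never
    # appears under the final name, then rename it into place.
    tmp="$dest.tmp.$$"
    if ! cp "$src" "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    mv "$tmp" "$dest"
}
```

A wrapper script that ends with archive_wal "$1" "$2" could then be
referenced as archive_command = '/path/to/archive_wal.sh %p
/mnt/server/archivedir/%f', where the script path is a placeholder.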

--
Kyotaro Horiguchi
NTT Open Source Software Center
