|From:||Tatsuro Yamada <tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp>|
|To:||Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org|
|Subject:||Re: Duplicate history file?|
On 2021/05/31 16:58, Kyotaro Horiguchi wrote:
> So, I started a thread for this topic diverged from the following
>> So, what should we do for the user? I think we should put some notes
>> in postgresql.conf or in the documentation. For example, something
>> like this:
> I'm not sure about the exact configuration you have in mind, but that
> would happen on the cascaded standby in the case where the upstream
> promotes. In this case, the history file for the new timeline is
> archived twice. walreceiver triggers archiving of the new history
> file at the time of the promotion, then startup does the same when it
> restores the file from archive. Is it what you complained about?
Thank you for creating a new thread and explaining this.
We are not using cascading replication in our environment, but I think
the situation is similar. In short, when I promote a server,
the archive_command fails because of the history file.
I've created a reproduction script that also sets up replication,
and I'll share it with you. (I used Robert's test.sh as a reference
when writing it. Thanks.)
The scenario (sr_test_historyfile.sh) is as follows.
#1 Start pgprimary as a main
#2 Create standby
#3 Start pgstandby as a standby
#4 Execute archive_command
#5 Shutdown pgprimary
#6 Start pgprimary as a standby
#7 Promote pgprimary
#8 Execute archive_command again, but it fails because a duplicate history
file already exists (see pgstandby.log)
Note that this may not be an appropriate recovery procedure for a
replication configuration. However, I'm sharing it as is because it
seems to be the procedure used in the customer's environment (PG-REX).
> The same workaround using the alternative archive script works for the
> We could check pg_wal before fetching archive, however, archiving is
> not controlled so strictly that duplicate archiving never happens and
> I think we choose possible duplicate archiving than having holes in
> archive. (so we suggest the "test ! -f" script)
>> Note: If you use archive_mode=always, the archive_command on the
>> standby side should not be used "test ! -f".
> It could be one workaround. However, I would suggest not to overwrite
> existing files (with a file with different content) to protect archive
> from corruption.
> We might need to write that in the documentation...
I think you're right: replacing it with an alternative archive script
that includes the cmp command will resolve the error, because the two
history files are identical. I confirmed that with the diff command:
$ diff -s pgprimary/arc/00000002.history pgstandby/arc/00000002.history
Files pgprimary/arc/00000002.history and pgstandby/arc/00000002.history are identical
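For reference, the kind of alternative archive script discussed above could look roughly like the following. This is only a minimal sketch, not the script from the other thread: the function name archive_wal_file and its two arguments (source path, destination path in the archive) are hypothetical. It treats an already-archived file with identical content as success and refuses to overwrite a file whose content differs.

```shell
#!/bin/sh
# Hypothetical helper: archive_wal_file SRC DST
#   SRC: file to archive (what %p would expand to)
#   DST: target path inside the archive directory
# Returns 0 if DST was created or already holds identical content,
# non-zero if DST exists with different content or the copy failed.
archive_wal_file() {
    src=$1
    dst=$2
    if [ -f "$dst" ]; then
        # Already archived: succeed only when the contents match,
        # so a file with different content is never overwritten.
        cmp -s "$src" "$dst"
        return $?
    fi
    # Copy under a temporary name first so a partial file never
    # appears under the final name, then rename into place.
    cp "$src" "$dst.tmp" && mv "$dst.tmp" "$dst"
}
```

A wrapper script ending with archive_wal_file "$1" "$2" could then be called from archive_command with %p and the archive path for %f as its arguments. With such a script, the duplicate 00000002.history in step #8 would be accepted because both copies are identical.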
Regarding "test ! -f",
I am wondering how many people are using the test command for
archive_command. If I remember correctly, the guide provided by
NTT OSS Center that we are using does not recommend using the test command.