Re: history file on replica and double switchover

From: David Zhang <david(dot)zhang(at)highgo(dot)ca>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Grigory Smolkin <g(dot)smolkin(at)postgrespro(dot)ru>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: history file on replica and double switchover
Date: 2020-09-24 23:15:17
Message-ID: 1c1a7d07-272b-7ced-1a1c-9a16aff33d54@highgo.ca
Lists: pgsql-hackers

Hi,

My understanding is that the "archiver" won't even start if
"archive_mode = on" has been set on a "replica". Therefore, on a
replica, checking either (XLogArchiveMode == ARCHIVE_MODE_ALWAYS) or
(XLogArchiveMode != ARCHIVE_MODE_OFF) will produce the same result.

Please see how the "archiver" is started in
src/backend/postmaster/postmaster.c

                /*
                 * Start the archiver if we're responsible for (re-)archiving received
                 * files.
                 */
                Assert(PgArchPID == 0);
                if (XLogArchivingAlways())
                        PgArchPID = pgarch_start();
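
To make that check concrete, here is a minimal standalone sketch of the
decision as I understand it. It is not the postmaster code: the enum is
redeclared locally and archiver_runs() is a hypothetical helper name. It
assumes that a server in recovery starts the archiver only for "always",
while a server in normal operation starts it for "on" or "always".

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for the three archive_mode settings: off, on, always. */
typedef enum
{
    ARCHIVE_MODE_OFF = 0,
    ARCHIVE_MODE_ON,
    ARCHIVE_MODE_ALWAYS
} ArchiveMode;

/*
 * Hypothetical helper (not a PostgreSQL function): returns true when the
 * archiver process would be running for the given mode and server state.
 */
bool
archiver_runs(ArchiveMode mode, bool in_recovery)
{
    /* Sanity check on the input, in the spirit of the Assert above. */
    assert(mode >= ARCHIVE_MODE_OFF && mode <= ARCHIVE_MODE_ALWAYS);

    if (in_recovery)
        return mode == ARCHIVE_MODE_ALWAYS;  /* standby: only "always" */

    return mode != ARCHIVE_MODE_OFF;         /* normal operation: "on" or "always" */
}
```

Under these assumptions, archiver_runs(ARCHIVE_MODE_ON, true) is false and
archiver_runs(ARCHIVE_MODE_ON, false) is true, which is the "starts
archiving after the promotion" behaviour.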

I ran the nice script "double_switchover.sh" with both "always" and "on"
against patches v1 and v2. Both patches generate the same results, shown
below. In other words, whether the history file (00000003.history) is
archived depends only on the "archive_mode" setting.

echo "archive_mode = always" >> ${PGDATA_NODE2}/postgresql.auto.conf

$ ls -l archive
-rw------- 1 david david 16777216 Sep 24 14:40 000000010000000000000002
... ...
-rw------- 1 david david 16777216 Sep 24 14:40 00000002000000000000000A
-rw------- 1 david david       41 Sep 24 14:40 00000002.history
-rw------- 1 david david       83 Sep 24 14:40 00000003.history

echo "archive_mode = on" >> ${PGDATA_NODE2}/postgresql.auto.conf

$ ls -l archive
-rw------- 1 david david 16777216 Sep 24 14:47 000000010000000000000002
... ...
-rw------- 1 david david 16777216 Sep 24 14:47 00000002000000000000000A
-rw------- 1 david david       41 Sep 24 14:47 00000002.history
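
The two listings above are consistent with the following toy model. It is
only a sketch under stated assumptions, not the patch itself: the function
names are hypothetical, the v1 condition is taken as
(XLogArchiveMode != ARCHIVE_MODE_OFF), the v2 condition as
(XLogArchiveMode == ARCHIVE_MODE_ALWAYS), and a file is assumed to reach
the archive on a standby only if it is marked ready and the archiver is
running.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for the three archive_mode settings: off, on, always. */
typedef enum
{
    ARCHIVE_MODE_OFF = 0,
    ARCHIVE_MODE_ON,
    ARCHIVE_MODE_ALWAYS
} ArchiveMode;

/* v1 condition: mark the streamed history file ready unless archiving is off. */
bool
v1_marks_ready(ArchiveMode mode)
{
    return mode != ARCHIVE_MODE_OFF;
}

/* v2 condition: mark it ready only when archive_mode=always. */
bool
v2_marks_ready(ArchiveMode mode)
{
    return mode == ARCHIVE_MODE_ALWAYS;
}

/* Assumption: on a standby the archiver runs only with "always". */
bool
standby_archiver_runs(ArchiveMode mode)
{
    return mode == ARCHIVE_MODE_ALWAYS;
}

/*
 * Hypothetical model: the history file shows up in the archive while the
 * node is a standby only if it was marked ready AND an archiver is running.
 */
bool
history_archived_on_standby(bool marked_ready, ArchiveMode mode)
{
    return marked_ready && standby_archiver_runs(mode);
}
```

In this model the two conditions differ only for archive_mode=on, where v1
marks the file ready but no archiver is running to act on it, so the
archive contents are the same for both patches in every mode.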

Personally, I prefer patch v2 since it aligns with the statement here,
https://www.postgresql.org/docs/13/warm-standby.html#CONTINUOUS-ARCHIVING-IN-STANDBY

"If archive_mode is set to on, the archiver is not enabled during
recovery or standby mode. If the standby server is promoted, it will
start archiving after the promotion, but will not archive any WAL it did
not generate itself."

By the way, I think the last part of the sentence should be changed to
something like below:

"but will not archive any WAL, which was not generated by itself."

Best regards,

David

On 2020-09-17 10:18 a.m., Fujii Masao wrote:
>
>
> On 2020/09/04 13:53, Fujii Masao wrote:
>>
>>
>> On 2020/09/04 8:29, Anastasia Lubennikova wrote:
>>> On 27.08.2020 16:02, Grigory Smolkin wrote:
>>>> Hello!
>>>>
>>>> I've noticed that when running switchover replica to master and
>>>> back to replica, new history file is streamed to replica, but not
>>>> archived,
>>>> which is not great, because it breaks PITR if archiving is running
>>>> on replica. The fix looks trivial.
>>>> Bash script to reproduce the problem and patch are attached.
>>>>
>>> Thanks for the report. I agree that it looks like a bug.
>>
>> +1
>>
>> +            /* Mark history file as ready for archiving */
>> +            if (XLogArchiveMode != ARCHIVE_MODE_OFF)
>> +                XLogArchiveNotify(fname);
>>
>> I agree that history file should be archived in the standby when
>> archive_mode=always. But why do we need to do when archive_mode=on?
>> I'm just concerned about the case where the primary and standby
>> have the shared archive area, and archive_mode is on.
>
> So I updated the patch so that walreceiver marks the streamed history
> file as ready for archiving only when archive_mode=always. Patch attached.
> Thought?
>
> Regards,
>
--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca
