Stronger safeguard for archive recovery not to miss data

From: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
To: "'pgsql-hackers(at)lists(dot)postgresql(dot)org'" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Stronger safeguard for archive recovery not to miss data
Date: 2020-11-26 07:18:39
Message-ID: OSBPR01MB4888CBE1DA08818FD2D90ED8EDF90@OSBPR01MB4888.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Hello

The attached patch is intended to prevent a scenario in which
archive recovery encounters WAL generated with wal_level=minimal
and the server nevertheless continues to run, as discussed in the thread at [1].
The motivation is to protect users from ending up with a replica
that could be missing data, or with a server that is missing data after targeted recovery.

We reached consensus in that thread on how to fix this:
change the ereport level from WARNING to FATAL in CheckRequiredParameterValues().
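
To make the intended behavior change concrete, here is a small stand-alone model of the check. It is only a sketch and not the PostgreSQL source: the enums, the `patched` flag, and the helper names `check_wal_level()` and `startup_continues()` are hypothetical stand-ins for the real ereport machinery. It assumes the check fires only when archive recovery is requested and the recorded wal_level is minimal.

```c
#include <stdbool.h>

/* Stand-in for the WAL levels; a simplified model, not PostgreSQL's header. */
typedef enum { WAL_LEVEL_MINIMAL, WAL_LEVEL_REPLICA, WAL_LEVEL_LOGICAL } WalLevel;

/* Stand-in for the two ereport levels involved in this change. */
typedef enum { SEV_NONE, SEV_WARNING, SEV_FATAL } Severity;

/*
 * Hypothetical helper modelling the check: when archive recovery is
 * requested and the WAL was generated with wal_level=minimal, report
 * WARNING (current behavior) or FATAL (behavior with the patch).
 */
static Severity
check_wal_level(bool archive_recovery_requested, WalLevel recorded_level,
                bool patched)
{
    if (archive_recovery_requested && recorded_level == WAL_LEVEL_MINIMAL)
        return patched ? SEV_FATAL : SEV_WARNING;
    return SEV_NONE;
}

/* Hypothetical helper: startup proceeds unless the check reported FATAL. */
static bool
startup_continues(Severity sev)
{
    return sev != SEV_FATAL;
}
```

In this model, a WARNING lets startup proceed with possibly missing data, while FATAL stops it, which is the whole point of the change.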

To test this fix, I did the following:
1 - take a base backup while wal_level is replica
2 - stop the server and change wal_level from replica to minimal
3 - restart the server (to generate XLOG_PARAMETER_CHANGE)
4 - stop the server and set wal_level back to replica
5 - start the server again
6 - run archive recovery in both of the following cases
(1) by editing postgresql.conf and
touching recovery.signal in the base backup from step 1
(2) by creating a replica with standby.signal
* I enabled archive_mode in this test while wal_level was replica.

First, I confirmed that the server starts up without this patch.
After applying this safeguard patch, I confirmed that
the server can no longer start up in this scenario.
With the patch, the log shows the following.

2020-11-26 06:49:46.003 UTC [19715] FATAL: WAL was generated with wal_level=minimal, data may be missing
2020-11-26 06:49:46.003 UTC [19715] HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup.

Lastly, this should be back-patched.
Any comments?

[1]
https://www.postgresql.org/message-id/TYAPR01MB29901EBE5A3ACCE55BA99186FE320%40TYAPR01MB2990.jpnprd01.prod.outlook.com

Best,
Takamichi Osumi

Attachment Content-Type Size
stronger_safeguard_for_archive_recovery.patch application/octet-stream 1.2 KB
