Re: Stronger safeguard for archive recovery not to miss data

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: osumi(dot)takamichi(at)fujitsu(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Stronger safeguard for archive recovery not to miss data
Date: 2020-11-26 07:28:40
Message-ID: 20201126.162840.1665375222523010434.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Thu, 26 Nov 2020 07:18:39 +0000, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com> wrote in
> Hello
>
>
> The attached patch is intended to prevent a scenario in which
> archive recovery encounters WAL generated with wal_level=minimal
> and the server nevertheless continues to work, as discussed in the thread [1].
> The motivation is to protect users from ending up with a replica
> that could be missing data, or with a server that is missing data after targeted recovery.
>
> We reached consensus in that thread on how to fix this:
> change the ereport level from WARNING to FATAL in CheckRequiredParameterValues().
>
> To test this fix, I did the following:
> 1 - take a base backup while wal_level is replica
> 2 - stop the server and change wal_level from replica to minimal
> 3 - restart the server (to generate XLOG_PARAMETER_CHANGE)
> 4 - stop the server and set wal_level back to replica
> 5 - start the server again
> 6 - run archive recovery in both of the following ways:
> (1) by editing postgresql.conf and
> touching recovery.signal in the base backup from step 1
> (2) by creating a replica with standby.signal
> * While wal_level was replica, I enabled archive_mode in this test.
>
> First, I confirmed that the server started up without this patch.
> After applying this safeguard patch, I checked that
> the server can no longer start up in this scenario.
> I checked the log and got the result below with this patch.
>
> 2020-11-26 06:49:46.003 UTC [19715] FATAL: WAL was generated with wal_level=minimal, data may be missing
> 2020-11-26 06:49:46.003 UTC [19715] HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup.
>
> Lastly, this should be backpatched.
> Any comments ?

Perhaps we need a TAP test that performs the above steps.
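
For reference, the behavior such a test would have to assert can be
modeled in a few lines. This is only a standalone sketch, not the
PostgreSQL source: the enum and function names below are hypothetical
stand-ins, and the "patched" flag is just a way to show the old and the
proposed severity side by side.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for this sketch; not PostgreSQL's definitions. */
typedef enum { LEVEL_MINIMAL, LEVEL_REPLICA, LEVEL_LOGICAL } SketchWalLevel;
typedef enum { SEV_OK, SEV_WARNING, SEV_FATAL } SketchSeverity;

/*
 * Decide how recovery reacts to WAL written under the given wal_level.
 * patched == false models the old behavior (WARNING, startup continues);
 * patched == true models the proposed behavior (FATAL, startup aborts).
 * Crash recovery (archive_recovery == false) is not affected.
 */
SketchSeverity
check_wal_level(bool archive_recovery, SketchWalLevel wal_level, bool patched)
{
    if (archive_recovery && wal_level == LEVEL_MINIMAL)
        return patched ? SEV_FATAL : SEV_WARNING;
    return SEV_OK;
}

/* Only a FATAL report stops the server from starting up. */
bool
startup_continues(SketchSeverity severity)
{
    return severity != SEV_FATAL;
}

/* The expectations a test following the quoted steps would encode. */
void
sketch_self_test(void)
{
    /* Without the patch, archive recovery warns but keeps going. */
    assert(startup_continues(check_wal_level(true, LEVEL_MINIMAL, false)));
    /* With the patch, the same situation stops the startup. */
    assert(!startup_continues(check_wal_level(true, LEVEL_MINIMAL, true)));
    /* WAL generated with wal_level=replica is unaffected either way. */
    assert(check_wal_level(true, LEVEL_REPLICA, true) == SEV_OK);
}
```

A real TAP test would instead drive an actual server through the quoted
steps and check that startup fails and that the FATAL message appears in
the log.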

> [1]
> https://www.postgresql.org/message-id/TYAPR01MB29901EBE5A3ACCE55BA99186FE320%40TYAPR01MB2990.jpnprd01.prod.outlook.com

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
