Re: Stronger safeguard for archive recovery not to miss data

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, 'Kyotaro Horiguchi' <horikyota(dot)ntt(at)gmail(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Stronger safeguard for archive recovery not to miss data
Date: 2021-01-20 12:02:51
Message-ID: 0cf55744297b3028485ecb9933c4256bd2985fe8.camel@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 2021-01-20 at 13:10 +0900, Fujii Masao wrote:
> + errhint("Run recovery again from a new base backup taken after setting wal_level higher than minimal")));
>
> Isn't it impossible to do this in normal archive recovery case? In that case,
> since the server crashed and the database got corrupted, probably
> we cannot take a new base backup.
>
> Originally even when users accidentally set wal_level to minimal, they could
> start the server from the backup by disabling hot_standby and salvage the data.
> But with the patch, we lost the way to do that. Right? I was wondering that
> WARNING was used intentionally there for that case.

I would argue that if you set your "wal_level" to minimal, do some work,
reset it to "replica" and recover past that, two things could happen:

1. Since "archive_mode" was off, you are missing some WAL segments and
cannot recover past that point anyway.

2. You are missing some relevant WAL records, and your recovered
database is corrupted. This is very likely, because you probably
switched to "minimal" with the intention to generate less WAL.

Now I see your point that a corrupted database may be better than no
database at all, but wouldn't you agree that a warning in the log
(that nobody reads) is too little information?

Normally we don't take such a relaxed attitude towards data corruption.

Perhaps there could be a GUC "recovery_allow_data_corruption" to
override this check, but I'd say that a warning is too little.

> if (ControlFile->wal_level < WAL_LEVEL_REPLICA)
> ereport(ERROR,
> (errmsg("hot standby is not possible because wal_level was not set to \"replica\" or higher on the primary server"),
> errhint("Either set wal_level to \"replica\" on the primary, or turn off hot_standby here.")));
>
> With the patch, we never reach the above code?

Right, that would have to go. I didn't notice that.

Yours,
Laurenz Albe

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message torikoshia 2021-01-20 12:18:35 Re: TOAST condition for column size
Previous Message Daniel Gustafsson 2021-01-20 11:58:13 Re: Support for NSS as a libpq TLS backend