Re: Stronger safeguard for archive recovery not to miss data

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, 'Kyotaro Horiguchi' <horikyota(dot)ntt(at)gmail(dot)com>
Cc: "david(at)pgmasters(dot)net" <david(at)pgmasters(dot)net>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "laurenz(dot)albe(at)cybertec(dot)at" <laurenz(dot)albe(at)cybertec(dot)at>
Subject: Re: Stronger safeguard for archive recovery not to miss data
Date: 2021-04-01 08:25:04
Message-ID: da4623a8-2cf2-9496-7c21-7085d2be7bb4@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2021/04/01 12:45, osumi(dot)takamichi(at)fujitsu(dot)com wrote:
> Thank you for sharing your ideas about the hint. Absolutely need to change the message.
> In my opinion, combining the basic idea of yours and Fujii-san's would be the best.
>
> Updated the patch and made v05. The changes I made are
>
> * rewording of errhint although this has become long !
> * fix of the typo in the TAP test
> * modification of my past changes not to change conditions in CheckRequiredParameterValues
> * rename of the test file to 024_archive_recovery.pl because two files are made
> since the last update of this patch
> * pgindent is conducted to check my alignment again.

Thanks for updating the patch!

+ errhint("Use a backup taken after setting wal_level to higher than minimal "
+ "or recover to the point in time before wal_level becomes minimal even though it causes data loss")));

ISTM that "or recover to the point in time before wal_level was changed
to minimal even though it may cause data loss" sounds better. Thought?

+# Check if standby.signal exists
+my $pgdata = $new_node->data_dir;
+ok (-f "${pgdata}/standby.signal", 'standby.signal was created');

+# Check if recovery.signal exists
+my $path = $another_node->data_dir;
+ok (-f "${path}/recovery.signal", 'recovery.signal was created');

Why are these tests necessary?
These seem to test that init_from_backup() works expectedly based on
the parameter "standby". But if we are sure that init_from_backup() works fine,
these tests don't seem to be necessary.

+use Config;

This is not necessary?

+# Make the wal_level back to replica
+$node->append_conf('postgresql.conf', $replica_config);
+$node->restart;
+check_wal_level('replica', 'wal_level went back to replica again');

IMO it's better to comment why this server restart is necessary.
As far as I understand correctly, this is necessary to ensure
the WAL file containing the record about the change of wal_level
(to minimal) is archived, so that the subsequent archive recovery
will be able to replay it.


> By the way, when I build postgres with this patch and enable-coverage option,
> the results of RT becomes unstable. Does someone know the reason ?
> When it fails, I get stderr like below

I have no idea about this. Does this happen even without the patch?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2021-04-01 08:46:42 Re: pgbench - add pseudo-random permutation function
Previous Message YoungHwan Joo 2021-04-01 08:09:28 [GSoC 2021 Proposal] Develop Performance Farm Benchmarks and Website