RE: Stronger safeguard for archive recovery not to miss data

From: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
To: 'Fujii Masao' <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, 'Kyotaro Horiguchi' <horikyota(dot)ntt(at)gmail(dot)com>
Cc: "david(at)pgmasters(dot)net" <david(at)pgmasters(dot)net>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "laurenz(dot)albe(at)cybertec(dot)at" <laurenz(dot)albe(at)cybertec(dot)at>
Subject: RE: Stronger safeguard for archive recovery not to miss data
Date: 2021-04-04 02:58:43
Message-ID: OSBPR01MB488886D6E30B32428C08E766ED789@OSBPR01MB4888.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thursday, April 1, 2021 5:25 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
> On 2021/04/01 12:45, osumi(dot)takamichi(at)fujitsu(dot)com wrote:
> > Thank you for sharing your ideas about the hint. Absolutely need to change
> the message.
> > In my opinion, combining the basic idea of yours and Fujii-san's would be
> the best.
> >
> > Updated the patch and made v05. The changes I made are
> >
> > * rewording of errhint although this has become long !
> > * fix of the typo in the TAP test
> > * modification of my past changes not to change conditions in
> > CheckRequiredParameterValues
> > * rename of the test file to 024_archive_recovery.pl because two files are
> made
> > since the last update of this patch
> > * pgindent is conducted to check my alignment again.
>
> Thanks for updating the patch!
>
> + errhint("Use a backup taken after setting
> wal_level to higher than minimal "
> + "or recover to the point in
> time before wal_level becomes
> +minimal even though it causes data loss")));
>
> ISTM that "or recover to the point in time before wal_level was changed
> to minimal even though it may cause data loss" sounds better. Thought?
Adopted.

> +# Check if standby.signal exists
> +my $pgdata = $new_node->data_dir;
> +ok (-f "${pgdata}/standby.signal", 'standby.signal was created');
>
> +# Check if recovery.signal exists
> +my $path = $another_node->data_dir;
> +ok (-f "${path}/recovery.signal", 'recovery.signal was created');
>
> Why are these tests necessary?
> These seem to test that init_from_backup() works expectedly based on the
> parameter "standby". But if we are sure that init_from_backup() works fine,
> these tests don't seem to be necessary.
Absolutely, you are right. Fixed.

> +use Config;
>
> This is not necessary?
Removed.

> +# Make the wal_level back to replica
> +$node->append_conf('postgresql.conf', $replica_config); $node->restart;
> +check_wal_level('replica', 'wal_level went back to replica again');
>
> IMO it's better to comment why this server restart is necessary.
> As far as I understand correctly, this is necessary to ensure the WAL file
> containing the record about the change of wal_level (to minimal) is archived,
> so that the subsequent archive recovery will be able to replay it.
OK, added some comments. Further, I felt the way I wrote this part was not good at all and self-evident
and developers who read this test would feel uneasy about that point.
So, a little bit fixed that test so that we can get clearer conviction for wal archive.

> > By the way, when I build postgres with this patch and enable-coverage
> > option, the results of RT becomes unstable. Does someone know the
> reason ?
> > When it fails, I get stderr like below
>
> I have no idea about this. Does this happen even without the patch?
Unfortunately, no. I get this only with --enable-coverage and with my patch,
althought regression tests have passed with this patch.
OSS HEAD doesn't produce the stderr even with --enable-coverage.

Best Regards,
Takamichi Osumi

Attachment Content-Type Size
stronger_safeguard_for_archive_recovery_v06.patch application/octet-stream 6.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message osumi.takamichi@fujitsu.com 2021-04-04 03:07:16 RE: Stronger safeguard for archive recovery not to miss data
Previous Message Zhihong Yu 2021-04-04 02:47:03 Unused variable found in AttrDefaultFetch