Re: [BUG] non archived WAL removed during production crash recovery

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: jgdr(at)dalibo(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, masao(dot)fujii(at)oss(dot)nttdata(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: [BUG] non archived WAL removed during production crash recovery
Date: 2020-04-27 23:01:38
Message-ID: 20200427230138.GB169337@paquier.xyz
Lists: pgsql-bugs pgsql-hackers

On Mon, Apr 27, 2020 at 06:21:07PM +0900, Kyotaro Horiguchi wrote:
> Agreed to the diagnosis and the fix. The fix reliably causes a restart
> point, and the restart point then manipulates the status files the right
> way before the CHECKPOINT command returns, in both cases.

Thanks for checking!

> If I were to add something to the fix, the following line may need a
> comment.
>
> +# Wait until the checkpoint record is replayed so that the following
> +# CHECKPOINT causes a restart point reliably.
> |+$standby1->poll_query_until('postgres',
> |+ qq{ SELECT pg_wal_lsn_diff(pg_last_wal_replay_lsn(), '$primary_lsn') >= 0 }

Makes sense, added a comment and applied to HEAD. I have also
improved the comment around the split with pg_switch_wal(), and
simplified the test to use the return value of that function as
the wait point.
--
Michael
