Re: WAL archive is lost

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: "matsumura(dot)ryo(at)fujitsu(dot)com" <matsumura(dot)ryo(at)fujitsu(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL archive is lost
Date: 2019-11-22 19:44:40
Message-ID: 20191122194440.ozgagnlommg6rtmz@development
Lists: pgsql-hackers

On Fri, Nov 22, 2019 at 05:31:55AM +0000, matsumura(dot)ryo(at)fujitsu(dot)com wrote:
>Hi all
>
>I found a situation in which a WAL archive file is lost even though no WAL segment file is lost.
>This causes archive recovery to fail. Is this behavior a bug?
>
>example:
>
> WAL segment files
> 000000010000000000000001
> 000000010000000000000002
> 000000010000000000000003
>
> Archive files
> 000000010000000000000001
> 000000010000000000000003
>
> Archive file 000000010000000000000002 is lost, but the WAL segment files
> are contiguous. Recovery from the archive (i.e. PITR) stops at the end of
> 000000010000000000000001.
>
>How to reproduce:
>- Set up replication (primary and standby).
>- Set [archive_mode = always] in standby.
>- The WAL receiver exits (e.g. because the primary goes down)
> after it inserts the last record in some WAL segment file but
> before it notifies the archiver about that segment file (creates the .ready file).
>
>Even if the WAL receiver restarts, the archiver is never notified about
>that WAL segment file.
>

That does indeed seem like a bug. We should certainly archive all WAL
segments, irrespective of primary shutdowns/restarts/whatever. I guess
we should make sure the archiver is properly notified before the exit.
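
As a rough illustration of the gap in the example above, the following
Python sketch compares segment names from the WAL directory with those in
the archive and reports segments that are missing from the archive even
though a later segment on the same timeline has already been archived. The
function name archive_gaps and the way the two name lists are obtained are
assumptions made for illustration; nothing here is part of PostgreSQL.

```python
import re

# WAL segment file names are 24 uppercase hex digits; the first 8 are the timeline ID.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def archive_gaps(wal_names, archive_names):
    """Return WAL segments absent from the archive although a later
    segment on the same timeline has been archived (hypothetical helper)."""
    wal = {n for n in wal_names if WAL_NAME.match(n)}
    archived = {n for n in archive_names if WAL_NAME.match(n)}

    # Newest archived segment per timeline. Names are fixed-width hex,
    # so plain string comparison orders them correctly.
    newest = {}
    for name in archived:
        tli = name[:8]
        if name > newest.get(tli, ""):
            newest[tli] = name

    # A segment newer than the newest archived one may simply not have
    # been archived yet, so it is not reported as a gap.
    return sorted(n for n in wal - archived if n < newest.get(n[:8], ""))


# The situation from the report: segment ...02 exists locally but was never archived.
wal_files = [
    "000000010000000000000001",
    "000000010000000000000002",
    "000000010000000000000003",
]
archive_files = [
    "000000010000000000000001",
    "000000010000000000000003",
]
print(archive_gaps(wal_files, archive_files))
```

In practice the two lists could come from listing the standby's WAL
directory and the archive destination; both locations depend on the
installation and are left out here on purpose.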

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
