Re: Unarchived WALs deleted after crash

From: Daniel Farina <daniel(at)heroku(dot)com>
To: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Unarchived WALs deleted after crash
Date: 2013-02-14 20:04:25
Message-ID: CAAZKuFYZsWg_rjv4ayC-rQGaVWRhDg5EU41nGqni+eC+wz0EdQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Feb 14, 2013 at 7:45 AM, Jehan-Guillaume de Rorthais
<jgdr(at)dalibo(dot)com> wrote:
> Hi,
>
> I am facing an unexpected behavior on a 9.2.2 cluster that I can
> reproduce on current HEAD.
>
> On a cluster with archive enabled but failing, after a crash of
> postmaster, the checkpoint occurring before leaving the recovery mode
> deletes any additional WALs, even those waiting to be archived.

I believe I have encountered this recently, but didn't get enough of a
chance to investigate it to write in about it. For me, the cause was
out-of-disk on the file system that exclusively contained WAL, which had
backlogged because archiving fell behind writing. This caused the
cluster to crash -- par for the course -- but it also created an archive
gap. At the time I thought there was some kind of bug in dealing
with out-of-space issues in the archiver (the .ready bookkeeping), but
the symptoms I saw seem like they might be explained by your report,
too.
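
For anyone who wants to check whether they are hitting the same thing,
here is a rough sketch (not something I ran at the time) of how the
.ready bookkeeping could be snapshotted before a restart and compared
afterwards. It assumes the 9.2 layout, where each segment waiting to be
archived has a <segment>.ready marker under pg_xlog/archive_status. The
function names are made up for illustration, and the comparison is only
meaningful while archiving is still failing, since a successfully
archived segment may be removed legitimately.

```python
import os


def pending_wal_segments(wal_dir):
    """List WAL segments that have a .ready marker, i.e. are waiting to be archived.

    wal_dir is the WAL directory (pg_xlog in 9.2); markers live in its
    archive_status subdirectory.
    """
    status_dir = os.path.join(wal_dir, "archive_status")
    suffix = ".ready"
    return sorted(
        name[: -len(suffix)]
        for name in os.listdir(status_dir)
        if name.endswith(suffix)
    )


def vanished_segments(pending_before, wal_dir):
    """Return segments from an earlier snapshot whose WAL file is now gone.

    Assumption: archiving was failing the whole time, so none of these
    segments should have been archived and recycled in between.
    """
    return [
        seg
        for seg in pending_before
        if not os.path.exists(os.path.join(wal_dir, seg))
    ]
```

Take the snapshot with pending_wal_segments() while the cluster is
backlogged, let it go through crash recovery, then pass the saved list to
vanished_segments(). Anything it returns was waiting for the archiver and
is no longer on disk.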

--
fdr
