Unarchived WALs deleted after crash

From: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Unarchived WALs deleted after crash
Date: 2013-02-14 15:45:42
Message-ID: 511D06A6.8050106@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I am facing an unexpected behavior on a 9.2.2 cluster that I can
reproduce on current HEAD.

On a cluster with archive enabled but failing, after a crash of
postmaster, the checkpoint occurring before leaving the recovery mode
deletes any additional WALs, even those waiting to be archived.

Because of this, after recovering from the crash, previous PITR backup
can not be used to restore the instance to a time where archiving was
failing. Any slaves fed by WAL or lagging in SR need to be recreated.

AFAICT, this is not documented and I would expect the WALs to be
archived by the archiver process when the cluster exits the recovery step.

Here is a simple scenario to reproduce this.

Configuration:

wal_level = archive
archive_mode = on
archive_command = '/bin/false'
log_checkpoints = on

Scenario:
createdb test
psql -c 'create table test as select i, md5(i::text) from
generate_series(1,3000000) as i;' test
kill -9 $(head -1 $PGDATA/postmaster.pid)
pg_ctl start

Using this scenario, log files shows:

LOG: archive command failed with exit code 1
DETAIL: The failed archive command was: /bin/false
WARNING: transaction log file "000000010000000000000001" could not
be archived: too many failures
LOG: database system was interrupted; last known up at 2013-02-14
16:12:58 CET
LOG: database system was not properly shut down; automatic recovery
in progress
LOG: crash recovery starts in timeline 1 and has target timeline 1
LOG: redo starts at 0/11400078
LOG: record with zero length at 0/13397190
LOG: redo done at 0/13397160
LOG: last completed transaction was at log time 2013-02-14
16:12:58.49303+01
LOG: checkpoint starting: end-of-recovery immediate
LOG: checkpoint complete: wrote 2869 buffers (17.5%); 0 transaction
log file(s) added, 9 removed, 7 recycled; write=0.023 s, sync=0.468 s,
total=0.739 s; sync files=2, longest=0.426 s, average=0.234 s
LOG: autovacuum launcher started
LOG: database system is ready to accept connections
LOG: archive command failed with exit code 1
DETAIL: The failed archive command was: /bin/false
LOG: archive command failed with exit code 1
DETAIL: The failed archive command was: /bin/false
LOG: archive command failed with exit code 1
DETAIL: The failed archive command was: /bin/false
WARNING: transaction log file "000000010000000000000011" could not
be archived: too many failures

Before the kill, "000000010000000000000001" was the WAL to archive.
After the kill, the checkpoint deleted 9 files before exiting recovery
mode and "000000010000000000000011" become the first WAL to archive.
"000000010000000000000001" through "000000010000000000000010" were
removed or recycled.

Is it expected ?
--
Jehan-Guillaume de Rorthais
http://www.dalibo.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Merlin Moncure 2013-02-14 15:52:05 Re: proposal or just idea for psql - show first N rows from relation backslash statement
Previous Message Alvaro Herrera 2013-02-14 15:34:42 Re: proposal or just idea for psql - show first N rows from relation backslash statement