Re: Power outage borked things (8.1.10)...

From: Darren Reed <darrenr+postgres(at)fastmail(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Power outage borked things (8.1.10)...
Date: 2008-02-20 19:14:44
Message-ID: 47BC7C24.9040606@fastmail.net
Lists: pgsql-admin

Tom Lane wrote:
> Darren Reed <darrenr+postgres(at)fastmail(dot)net> writes:
> > Starting up postgres, I get the log contents below.
> > Is it really as bad as it suggests, namely that I
> > need to recover from backup?
>
> Probably :-(

I've started a new db in parallel with the old data and I'm rebuilding,
and if I can rebuild quicker than I can recover old data, I'll do that.

> pg_resetxlog would let you into the database, but I do not have high
> hopes about the consistency/correctness of what you'll find. The best
> advice would be:
>
> 1. pg_resetxlog

# su postgres -c "/usr/pkg/bin/pg_resetxlog -n /data/db"
pg_control values:

pg_control version number: 812
Catalog version number: 200510211
Database system identifier: 5138205682483264479
Current log file ID: 2
Next log file segment: 103
Latest checkpoint's TimeLineID: 1
Latest checkpoint's NextXID: 4570963
Latest checkpoint's NextOID: 24576
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Maximum data alignment: 4
Database block size: 8192
Blocks per segment of large relation: 131072
Maximum length of identifiers: 64
Maximum columns in an index: 32
Date/time type storage: floating-point numbers
Maximum length of locale name: 128
LC_COLLATE: C
LC_CTYPE: C
# su postgres -c "/usr/pkg/bin/pg_resetxlog -f /data/db"
Transaction log reset
And a start is greeted with:
LOG: database system was shut down at 2008-02-20 11:04:51 PST
LOG: checkpoint record is at 2/6A00001C
LOG: redo record is at 2/6A00001C; undo record is at 2/6A00001C; shutdown TRUE
LOG: next transaction ID: 4570963; next OID: 24576
LOG: next MultiXactId: 1; next MultiXactOffset: 0
PANIC: could not access status of transaction 4570963
DETAIL: could not read from file "pg_clog/0004" at offset 90112: Undefined error: 0
LOG: startup process (PID 29662) was terminated by signal 6
LOG: aborting startup due to startup process failure
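
For what it's worth, the file name and offset in the PANIC line appear to follow directly from the transaction ID. A rough sketch in Python, assuming the stock pg_clog layout (2 status bits per transaction, 8192-byte pages, 32 pages per segment file); those constants and the helper name clog_location are my assumptions, not something the log states:

```python
# Assumed stock pg_clog layout (not taken from the log output above).
CLOG_BITS_PER_XACT = 2      # two status bits per transaction
BLOCK_SIZE = 8192           # "Database block size" reported by pg_resetxlog -n
PAGES_PER_SEGMENT = 32      # pages in each pg_clog segment file

XACTS_PER_BYTE = 8 // CLOG_BITS_PER_XACT        # 4 transactions per byte
XACTS_PER_PAGE = BLOCK_SIZE * XACTS_PER_BYTE    # 32768 transactions per page


def clog_location(xid):
    """Return (segment file name, byte offset of the page) holding xid's status."""
    page = xid // XACTS_PER_PAGE
    segment = page // PAGES_PER_SEGMENT
    offset = (page % PAGES_PER_SEGMENT) * BLOCK_SIZE
    return "%04X" % segment, offset


# NextXID from the pg_control dump above.
print(clog_location(4570963))   # ('0004', 90112)
```

That matches "pg_clog/0004" at offset 90112, so startup is asking for the page that would hold the status of the next transaction ID. The "Undefined error: 0" probably means the read came back short, i.e. the file ends before that page, rather than a real I/O error.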

>
> 2. pg_dumpall
>
> 3. initdb, restore from backup
>
> 4. compare dump from step 2 to backup dump, apply any changes that seem
> sane
>
> And don't forget
>
> 5. Figure out why a simple power failure was able to do this to you,
> and fix it. The most likely bet is that your disk drives are lying
> about write completion ... see the PG archives for discussion.

Yes, that I can believe.

Darren
