Crash while recovering database index relation

From: Guy Thornley <guy(at)esphion(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Crash while recovering database index relation
Date: 2004-01-07 09:09:03
Message-ID: 20040107090903.GB18564@conker.esphion.com
Lists: pgsql-bugs

Hi,

On one of our test boxen here, we've experienced a corrupted file during
database recovery after a power outage on the box. The specific error message is:

PANIC: invalid page header in block 6 of relation "17792"

At this point I fired up a hex dumper to inspect the file, and the last
block in the file (the one the error refers to) was clearly garbage.
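
For anyone wanting to repeat the inspection, here is a minimal sketch of dumping one block of a relation file. It assumes the default 8192-byte block size (BLCKSZ); the function names read_block and hex_dump and the command-line usage are made up for illustration and are not part of postgres.

```python
import sys

BLCKSZ = 8192  # assumption: the server was built with the default block size


def read_block(path, block_no, block_size=BLCKSZ):
    """Return the raw bytes of one block of a relation file."""
    with open(path, "rb") as f:
        f.seek(block_no * block_size)
        return f.read(block_size)


def hex_dump(data, width=16, limit=64):
    """Format the first `limit` bytes as offset / hex / ASCII lines."""
    lines = []
    for off in range(0, min(len(data), limit), width):
        chunk = data[off:off + width]
        hex_part = " ".join("%02x" % b for b in chunk).ljust(width * 3)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append("%08x  %s %s" % (off, hex_part, text))
    return "\n".join(lines)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python dumpblock.py <relation file> <block number>
    print(hex_dump(read_block(sys.argv[1], int(sys.argv[2]))))
```

Run against the file named in the PANIC message with block number 6, this prints the start of the page so it can be compared by eye with a neighbouring, healthy block.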

This was on postgres 7.4. The system in question is using ReiserFS, and some
journal transactions were replayed on the same boot as the failed postgres
recovery. I believe this is significant (see below).

By running the postgres single-user database server with the
zero_damaged_pages option I managed to get the database up again. There were
a LOT of relations with this problem!

It may be significant that this is an index (primary key) for a relation.
ALL of the files with problems were either indexes or primary keys!

I do NOT believe this was a hardware error. What I think happened is:
- postgres extended some indexes
- reiserfs journalled the metadata
- the new file contents got buffered by the kernel in memory
- the XLog got fsync()'d
- power cycle
- reiserfs replayed the metadata journal, which extended the files.
  This probably makes the last blocks in each file invalid!
- postgres attempts to recover from its log, and bumps into the (now
  garbage) blocks
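
The sequence above can be sketched as a toy model. This is only an illustration of the hypothesis, not how ReiserFS or postgres is actually implemented; the class ToyJournaledFile and all of its methods are invented for this example.

```python
class ToyJournaledFile:
    """Toy file whose size (metadata) is journalled but whose data is not."""

    def __init__(self, raw_disk_blocks):
        self.raw = list(raw_disk_blocks)  # stale bytes already on the disk
        self.size = 0       # in-memory file length, in blocks
        self.journal = []   # durable metadata journal: recorded file sizes
        self.cache = {}     # volatile kernel buffers: block number -> data

    def extend(self, data):
        """Append one block: metadata goes to the journal, data stays in RAM."""
        block_no = self.size
        self.size += 1
        self.journal.append(self.size)
        self.cache[block_no] = data
        return block_no

    def flush(self):
        """Write buffered data blocks to disk (what did NOT happen in time)."""
        for block_no, data in self.cache.items():
            self.raw[block_no] = data
        self.cache.clear()

    def crash_and_replay(self):
        """Power cycle: buffers are lost, then the metadata journal is replayed."""
        self.cache.clear()
        self.size = self.journal[-1] if self.journal else 0

    def read(self, block_no):
        if block_no >= self.size:
            raise IndexError("block beyond end of file")
        return self.cache.get(block_no, self.raw[block_no])


f = ToyJournaledFile([b"old-junk"] * 3)
f.extend(b"page0")
f.flush()              # first page reaches the disk
f.extend(b"page1")     # second page is only buffered
f.crash_and_replay()
print(f.size, f.read(0), f.read(1))
```

After the simulated crash the file is two blocks long, but the last block holds whatever was on the disk before, which matches the garbage seen in the last block of the real files.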

I'll see if I can get some time to reproduce this reliably.

Guy Thornley
