Re: Bug (#3484) - Invalid page header again

From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: pgsql-bugs(at)postgresql(dot)org
Cc: alex <an(at)clickware(dot)de>
Subject: Re: Bug (#3484) - Invalid page header again
Date: 2007-12-18 16:24:19
Message-ID: 4767F433.9010700@sun.com
Lists: pgsql-bugs

Zdenek Kotala wrote:
> alex wrote:
>
> <snip>
>
>> WARNING: relation "transaktion" TID 1240631/12: OID is invalid
>> ERROR: invalid page header in block 1240632 of relation "transaktion"
>> 7. 2007/12/10 : We started the export of the data ( which runs every
>> morning ) for the last days again. These exports use the same
>> SQL-Commands as the automatical run.
>
> Alex,
>
> please can you provide binary dump of these two pages or if there are
> sensitive data try to use pg_filedump to get only page and tuple headers?
>
>

I got a dump of the two affected blocks from Alex. It seems that both blocks were
overwritten together by some structure 128 bytes long (there is a repeating
pattern). The total damaged size is 9728 bytes: the first block is overwritten
completely and the second one only at the beginning. A buffer from another
relation could have been overwritten as well.

I think it is more likely a software bug than a hardware fault, because the bad
data has some structure to it. For example, a 0x54 byte repeats every 128 bytes,
and most of the remaining data is zeros.
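
The pattern check described above can be sketched in a few lines of C. This is only an illustration, not part of PostgreSQL or pg_filedump: the helper names are hypothetical, and the offset at which the 0x54 byte sits inside each 128-byte record is an assumption passed in by the caller. Note that 9728 bytes is exactly 76 records of 128 bytes, which fits the idea of a regular structure being written over the buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count how often `marker` appears at offsets start, start + stride,
 * start + 2 * stride, ... inside buf[0..len).
 * Hypothetical helper for inspecting a raw page dump.
 */
static size_t
count_marker_at_stride(const uint8_t *buf, size_t len,
                       size_t start, size_t stride, uint8_t marker)
{
    size_t hits = 0;

    assert(stride > 0);         /* a zero stride would never terminate */
    for (size_t off = start; off < len; off += stride)
        if (buf[off] == marker)
            hits++;
    return hits;
}

/* Count the zero bytes in buf[0..len). */
static size_t
count_zero_bytes(const uint8_t *buf, size_t len)
{
    size_t zeros = 0;

    for (size_t i = 0; i < len; i++)
        if (buf[i] == 0)
            zeros++;
    return zeros;
}
```

If the marker count equals len / stride and nearly all other bytes are zero, the damage looks like a regular array of records rather than random bit errors.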

My suggestion is to apply the following patch to find out whether the data is
corrupted by PostgreSQL or elsewhere. The backend should fail before the damaged
data is written to disk. The patch is for HEAD, but a similar patch could be
backported.

Index: backend/storage/buffer/bufmgr.c
===================================================================
RCS file: /zfs_data/cvs_pgsql/cvsroot/pgsql/src/backend/storage/buffer/bufmgr.c,v
retrieving revision 1.227
diff -c -r1.227 bufmgr.c
*** backend/storage/buffer/bufmgr.c 15 Nov 2007 21:14:37 -0000 1.227
--- backend/storage/buffer/bufmgr.c 18 Dec 2007 15:50:06 -0000
***************
*** 1734,1739 ****
--- 1734,1741 ----
      buf->flags &= ~BM_JUST_DIRTIED;
      UnlockBufHdr(buf);

+     if (!PageHeaderIsValid((PageHeader) BufHdrGetBlock(buf)))
+         elog(FATAL, "Buffer cache is damaged!");
      smgrwrite(reln,
                buf->tag.blockNum,
                (char *) BufHdrGetBlock(buf),
***************
*** 1966,1971 ****
--- 1968,1976 ----
                  errcontext.previous = error_context_stack;
                  error_context_stack = &errcontext;

+                 if (!PageHeaderIsValid((PageHeader) LocalBufHdrGetBlock(bufHdr)))
+                     elog(FATAL, "Buffer cache is damaged!");
+
                  smgrwrite(rel->rd_smgr,
                            bufHdr->tag.blockNum,
                            (char *) LocalBufHdrGetBlock(bufHdr),
Index: backend/storage/buffer/localbuf.c
===================================================================
RCS file: /zfs_data/cvs_pgsql/cvsroot/pgsql/src/backend/storage/buffer/localbuf.c,v
retrieving revision 1.78
diff -c -r1.78 localbuf.c
*** backend/storage/buffer/localbuf.c 15 Nov 2007 21:14:38 -0000 1.78
--- backend/storage/buffer/localbuf.c 18 Dec 2007 16:05:49 -0000
***************
*** 16,21 ****
--- 16,22 ----
  #include "postgres.h"

  #include "storage/buf_internals.h"
+ #include "storage/bufpage.h"
  #include "storage/bufmgr.h"
  #include "storage/smgr.h"
  #include "utils/guc.h"
***************
*** 161,166 ****
--- 162,169 ----
          oreln = smgropen(bufHdr->tag.rnode);

          /* And write... */
+         if (!PageHeaderIsValid((PageHeader) LocalBufHdrGetBlock(bufHdr)))
+             elog(FATAL, "Local buffer cache is damaged!");
          smgrwrite(oreln,
                    bufHdr->tag.blockNum,
                    (char *) LocalBufHdrGetBlock(bufHdr),
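
To show what kind of test the patch relies on, here is a standalone sketch of a page header sanity check. It is not the real PageHeaderIsValid(): the struct below is a hypothetical, much simplified header holding only three offsets, the 8192-byte block size and 8-byte alignment are assumed defaults, and all names are made up for the example. As far as I know the real function also accepts a page consisting entirely of zeros, so the sketch does the same.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192u     /* assumed default block size */
#define SKETCH_ALIGN  8u        /* assumed maximum alignment */

/* Hypothetical simplified header; NOT the real PostgreSQL page layout. */
typedef struct
{
    uint16_t lower;             /* end of the line pointer area */
    uint16_t upper;             /* start of the tuple data */
    uint16_t special;           /* start of the special space */
} SketchHeader;

/* Return true if every byte of the page is zero. */
static bool
sketch_page_is_all_zero(const uint8_t *page)
{
    for (size_t i = 0; i < SKETCH_BLCKSZ; i++)
        if (page[i] != 0)
            return false;
    return true;
}

/* Return true if the header offsets are consistent or the page is empty. */
static bool
sketch_header_is_sane(const uint8_t *page)
{
    SketchHeader hdr;

    assert(page != NULL);
    memcpy(&hdr, page, sizeof hdr);

    if (hdr.lower >= sizeof hdr &&
        hdr.lower <= hdr.upper &&
        hdr.upper <= hdr.special &&
        hdr.special <= SKETCH_BLCKSZ &&
        hdr.special % SKETCH_ALIGN == 0)
        return true;

    /* A never-initialized (all-zero) page is also treated as valid. */
    return sketch_page_is_all_zero(page);
}
```

One consequence of the all-zero rule: a check like this catches a buffer whose header was overwritten with garbage, but not a buffer that was wiped to zeros completely.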
