Re: invalid page header

From: Chris Travers <chris(at)metatrontech(dot)com>
To: Jo De Haes <jo(dot)de_nospam_haes(at)indicator(dot)be>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: invalid page header
Date: 2006-03-29 16:18:59
Message-ID: 442AB373.1020004@metatrontech.com
Lists: pgsql-general
Jo De Haes wrote:

> OK.  The saga continues, everything is a little bit more clear, but at 
> the same time a lot more confusing.
>
> Today i wanted to reproduce the problem again.  And guess what? A 
> vacuum of the database went thru without any problems.
>
> I dump the block i was having problems with yesterday.  It doesn't 
> report an invalid header anymore and it contains other data!!!
>
Inconsistent problems, especially with PostgreSQL, are usually the 
result of hardware failure.

> Turns out the data that was returned yesterday belongs to another 
> database!
>
> Some more detail about the setup.  This server runs 2 instances of 
> postgresql.  One production instance which is version 8.0.3.  And 
> another testing instance installed in a different folder which runs 
> version 8.1.3  Am I wrong thinking this setup ought to work?

No, you are not wrong.  I have done it before too.  PostgreSQL instances 
running on different ports or addresses are sufficiently isolated to 
prevent this from being a problem.

>
> Both instances use completely seperated data folders.
>
> So the first dump returned data that actually belongs to an 8.0.3 
> database (that runs fine).  And today without _any_ intervention that 
> same block returns the correct data and the complete database is fine.
>
> Where is the problem?
>     The fact that i'm running 2 different instances?
>     Cache on raid controller messing up?
>     Some strange voodoo?

I would first see what sort of memory testing suite you can run on your 
system (memtest86, for example) and go from there.  It sounds to me like 
some sort of hardware issue.  Bits *could* be getting flipped anywhere, 
from the write head on the disk to the main system memory or the CPU.

The likelihood that it is a random RAM error is reduced if you are using 
ECC RAM.  Otherwise it could be anything.

That said, when I have seen bits flipped by the CPU, you usually get 
a lot of index issues and shared memory corruption, so I would be 
more inclined to think that this is the RAM or the RAID cache.
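
As a rough complement to a memory test, one could re-read the suspect 
block several times and compare checksums, to see whether the same block 
keeps coming back with different contents.  The sketch below is only an 
illustration under stated assumptions: the function names are 
hypothetical, the 8192-byte block size assumes a default PostgreSQL 
build, and the file should only be checked while the server that owns it 
is stopped, since a running server may legitimately rewrite the block.  
Repeated reads may also be served from the operating system's cache 
rather than from the disk or RAID controller, so matching reads do not 
prove the hardware is healthy.

```python
import hashlib

# Assumption: default PostgreSQL page size; adjust if the build differs.
BLOCK_SIZE = 8192


def block_digest(path, block_number, block_size=BLOCK_SIZE):
    """Return the SHA-256 hex digest of one block of a file."""
    with open(path, "rb") as f:
        f.seek(block_number * block_size)
        data = f.read(block_size)
    if len(data) != block_size:
        raise ValueError("block %d is short or past end of file" % block_number)
    return hashlib.sha256(data).hexdigest()


def block_is_stable(path, block_number, reads=3, block_size=BLOCK_SIZE):
    """Read the same block several times; True if every read was identical."""
    digests = {block_digest(path, block_number, block_size) for _ in range(reads)}
    return len(digests) == 1
```

Calling block_is_stable() with the path of the relation file and the 
number of the block that reported the invalid header would show whether 
repeated reads agree; a False result on a file nobody is writing to 
would point toward the hardware rather than toward PostgreSQL.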

Best Wishes,
Chris Travers
Metatron Technology Consulting
