Re: Detecting corrupted pages earlier

From: Greg Copeland <greg(at)CopelandConsulting(dot)Net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Curt Sampson <cjs(at)cynic(dot)net>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Detecting corrupted pages earlier
Date: 2003-02-18 20:07:15
Message-ID: 1045598835.3290.2.camel@mouse.copelandconsulting.net
Lists: pgsql-hackers

On Mon, 2003-02-17 at 22:04, Tom Lane wrote:
> Curt Sampson <cjs(at)cynic(dot)net> writes:
> > On Mon, 17 Feb 2003, Tom Lane wrote:
> >> Postgres has a bad habit of becoming very confused if the page header of
> >> a page on disk has become corrupted.
>
> > What typically causes this corruption?
>
> Well, I'd like to know that too. I have seen some cases that were
> identified as hardware problems (disk wrote data to wrong sector, RAM
> dropped some bits, etc). I'm not convinced that that's the whole story,
> but I have nothing to chew on that could lead to identifying a software
> bug.
>
> > If it's any kind of a serious problem, maybe it would be worth keeping
> > a CRC of the header at the end of the page somewhere.
>
> See past discussions about keeping CRCs of page contents. Ultimately
> I think it's a significant expenditure of CPU for very marginal returns
> --- the layers underneath us are supposed to keep their own CRCs or
> other cross-checks, and a very substantial chunk of the problem seems
> to be bad RAM, against which occasional software CRC checks aren't
> especially useful.

This is exactly why "magic numbers" or simple algorithmic bit patterns
are commonly used. If the "magic number" or bit pattern doesn't match
its page number, you know something is wrong. Storage cost tends to be
slight and CPU overhead low.
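
A minimal sketch of such a check, in C. Everything named here
(DemoPageHeader, PAGE_MAGIC_SEED, page_magic, page_stamp, page_magic_ok)
is hypothetical and chosen for illustration; none of it is an existing
Postgres structure or function, and the rotate-and-XOR pattern is just
one arbitrary choice of "algorithmic bit pattern":

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical seed constant; any fixed, non-trivial value would do. */
#define PAGE_MAGIC_SEED 0x5A3C96E1u

/* Hypothetical page header, reduced to the one field the check needs. */
typedef struct
{
    uint32_t magic;     /* bit pattern derived from the page number */
} DemoPageHeader;

/*
 * Derive the expected pattern for a page: rotate the page number left
 * by 13 bits and XOR it with the seed. Both steps are reversible, so
 * two different page numbers never share a pattern.
 */
static uint32_t
page_magic(uint32_t page_no)
{
    uint32_t rotated = (page_no << 13) | (page_no >> 19);

    return rotated ^ PAGE_MAGIC_SEED;
}

/* Stamp the header when the page is written out. */
static void
page_stamp(DemoPageHeader *hdr, uint32_t page_no)
{
    hdr->magic = page_magic(page_no);
}

/*
 * Verify the header when the page is read back. The caller passes the
 * page number it asked for, so a page that landed in the wrong place
 * fails the comparison, as does a header whose magic field was
 * overwritten with something else.
 */
static bool
page_magic_ok(const DemoPageHeader *hdr, uint32_t expected_page_no)
{
    return hdr->magic == page_magic(expected_page_no);
}
```

The check costs four bytes per page plus a shift, an XOR and a compare
on each read. It only notices damage that touches the magic field or a
page stored under the wrong number; unlike a CRC, it says nothing about
the rest of the page contents.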

I agree with you that a CRC seems like overkill for little return.

Regards,

--
Greg Copeland <greg(at)copelandconsulting(dot)net>
Copeland Computer Consulting
