Re: Block-level CRC checks

From: decibel <decibel(at)decibel(dot)org>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Simon Riggs" <simon(at)2ndQuadrant(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Aidan Van Dyk" <aidan(at)highrise(dot)ca>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "Pg Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Block-level CRC checks
Date: 2009-12-01 22:15:06
Message-ID: E35E61B0-86AC-45B2-9C79-DD54143E7990@decibel.org
Lists: pgsql-hackers

On Dec 1, 2009, at 1:39 PM, Kevin Grittner wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>
>> And a lot of our biggest users are having issues; it seems pretty
>> much guaranteed that if you have more than 20 postgres servers, at
>> least one of them will have bad memory, bad RAID and/or a bad
>> driver.
>
> Huh?!? We have about 200 clusters running on about 100 boxes, and
> we see that very rarely. On about 100 older boxes, relegated to
> less critical tasks, we see a failure maybe three or four times per
> year. It's usually not subtle, and a sane backup and redundant
> server policy has kept us from suffering much pain from these. I'm
> not questioning the value of adding features to detect corruption,
> but your numbers are hard to believe.

That's just your experience. Others have had different experiences.

And honestly, bickering about exact numbers misses Josh's point
completely. Postgres is seriously lacking in its ability to detect
hardware problems, and hardware *does fail*. And you can't just
assume that when it fails it blows up completely.

We really do need some capability for detecting errors.

>> The problem I have with CRC checks is that it only detects bad
>> I/O, and is completely unable to detect data corruption due to bad
>> memory. This means that really we want a different solution which
>> can detect both bad RAM and bad I/O, and should only fall back on
>> CRC checks if we're unable to devise one.
>
> md5sum of each tuple? As an optional system column (a la oid)

That's a possibility.
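
To make the per-tuple idea concrete, here is a rough sketch in Python of what such a digest could look like. Everything in it is an assumption for illustration: the function name tuple_md5, the representation of a row as a list of byte strings, and the NULL/length framing are hypothetical and not how Postgres lays out tuples.

```python
import hashlib
from typing import Optional, Sequence


def tuple_md5(values: Sequence[Optional[bytes]]) -> str:
    """Return an MD5 hex digest over a row's column values.

    Hypothetical framing: each value is prefixed with a NULL marker
    and its length, so that column boundaries are part of the digest.
    """
    h = hashlib.md5()
    for v in values:
        if v is None:
            h.update(b"\x00")                    # NULL column
        else:
            h.update(b"\x01")                    # non-NULL column
            h.update(len(v).to_bytes(4, "big"))  # length prefix
            h.update(v)
    return h.hexdigest()
```

The digest would be computed when the tuple is formed and rechecked whenever it is read, so a bit flipped in RAM after that point would show up as a mismatch. The cost is 16 bytes per tuple plus the hashing time.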

As Josh mentioned, some people will pay a serious performance hit to
ensure that their data is safe and correct. The CRC proposal was
intended as a middle-of-the-road approach that would at least tell
you that your hardware was probably OK. There's certainly more that
could be done.
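
For illustration, a block-level CRC check could look roughly like the Python sketch below. The page layout is an assumption: I'm pretending the first 4 bytes of each 8 kB page are reserved for the CRC, which is not the actual Postgres page header, and the names stamp_page and verify_page are made up.

```python
import struct
import zlib

PAGE_SIZE = 8192   # default Postgres block size
CRC_OFFSET = 0     # hypothetical: CRC stored in the first 4 bytes
CRC_LEN = 4


def page_crc(page: bytes) -> int:
    """CRC-32 of the page, with the CRC field itself treated as zeros."""
    if len(page) != PAGE_SIZE:
        raise ValueError("unexpected page size")
    body = (page[:CRC_OFFSET] + b"\x00" * CRC_LEN
            + page[CRC_OFFSET + CRC_LEN:])
    return zlib.crc32(body) & 0xFFFFFFFF


def stamp_page(page: bytes) -> bytes:
    """Return a copy of the page with its CRC field filled in (before write)."""
    crc = page_crc(page)
    return (page[:CRC_OFFSET] + struct.pack("<I", crc)
            + page[CRC_OFFSET + CRC_LEN:])


def verify_page(page: bytes) -> bool:
    """Check the stored CRC against a freshly computed one (after read)."""
    (stored,) = struct.unpack_from("<I", page, CRC_OFFSET)
    return stored == page_crc(page)
```

Stamping would happen just before the page is handed to the OS and verification just after it is read back, so this only covers the I/O path and not corruption that happens while the page sits in shared memory.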

Also, I think some means of detecting torn pages would be very
welcome. If this were done at the storage manager level it would
probably be fairly transparent to the rest of the code.
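
One possible way to detect torn pages, sketched in Python under some loud assumptions: the device writes 512-byte sectors atomically, and the page format reserves the last byte of every sector for a write-generation stamp. Neither is true of the current on-disk format; the names stamp_sectors and is_torn are hypothetical.

```python
PAGE_SIZE = 8192
SECTOR_SIZE = 512   # assumed atomic write unit of the device


def _sector_ends():
    """Offsets just past the end of each sector within a page."""
    return range(SECTOR_SIZE, PAGE_SIZE + 1, SECTOR_SIZE)


def stamp_sectors(page: bytes, generation: int) -> bytes:
    """Write the same generation byte at the end of every sector."""
    if len(page) != PAGE_SIZE:
        raise ValueError("unexpected page size")
    buf = bytearray(page)
    for end in _sector_ends():
        buf[end - 1] = generation & 0xFF
    return bytes(buf)


def is_torn(page: bytes) -> bool:
    """A page is torn if its sectors do not all carry the same stamp."""
    stamps = {page[end - 1] for end in _sector_ends()}
    return len(stamps) > 1
```

If a crash interrupts a write partway through, some sectors carry the new generation and some the old, and the mismatch is visible on the next read. A one-byte stamp wraps after 256 writes, so this is a sketch of the idea rather than something robust.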

>> checking data format for readable pages and tuples (and index
>> nodes) both before and after write to disk
>
> Given that PostgreSQL goes through the OS, and many of us are using
> RAID controllers with BBU RAM, how do you do a read with any
> confidence that it came from the disk? (I mean, I know how to do
> that for a performance test, but as a routine step during production
> use?)

You'd probably need to go to some kind of stand-alone or background
process that slowly reads and verifies the entire database.
Unfortunately, at that point you could only detect corruption and not
correct it, but it'd still be better than nothing.
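
A background verifier along those lines might be sketched like this in Python. The page layout (a little-endian CRC-32 in the first 4 bytes, covering the rest of the page), the function names, and the pause_seconds throttle are all assumptions for illustration.

```python
import struct
import time
import zlib
from typing import Callable, List

PAGE_SIZE = 8192


def crc_ok(page: bytes) -> bool:
    """Hypothetical layout: first 4 bytes hold the CRC-32 of the rest."""
    (stored,) = struct.unpack_from("<I", page, 0)
    return stored == (zlib.crc32(page[4:]) & 0xFFFFFFFF)


def scrub_file(path: str,
               verify: Callable[[bytes], bool] = crc_ok,
               pause_seconds: float = 0.0) -> List[int]:
    """Read a relation file page by page; return the bad block numbers.

    Note: this reads through the OS, so a page may be served from
    cache rather than from the disk itself.
    """
    bad = []
    with open(path, "rb") as f:
        block_no = 0
        while True:
            page = f.read(PAGE_SIZE)
            if not page:
                break
            # A short trailing page counts as damaged too.
            if len(page) != PAGE_SIZE or not verify(page):
                bad.append(block_no)
            block_no += 1
            if pause_seconds:
                time.sleep(pause_seconds)  # throttle to limit I/O impact
    return bad
```

Something like scrub_file("/path/to/relation/file", pause_seconds=0.01) would walk one file slowly and report which blocks fail verification; it only reports, it does not repair.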
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net
