Re: Block-level CRC checks

From: Decibel! <decibel(at)decibel(dot)org>
To: pgsql(at)mohawksoft(dot)com
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pg Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Block-level CRC checks
Date: 2008-09-30 21:10:58
Message-ID: 2220D58D-DD72-4F17-8DF2-E2ACD25CA774@decibel.org
Lists: pgsql-hackers

On Sep 30, 2008, at 2:17 PM, pgsql(at)mohawksoft(dot)com wrote:
>> A customer of ours has been having trouble with corrupted data for some
>> time. Of course, we've almost always blamed hardware (and we've seen
>> RAID controllers have their firmware upgraded, among other actions), but
>> the useful thing to know is when corruption has happened, and where.
>
> That is an important statement, to know when it happens not necessarily
> to be able to recover the block or where in the block it is corrupt.
> Is that correct?

Oh, correcting the corruption would be AWESOME beyond belief! But at
this point I'd settle for just knowing it had happened.

>> So we've been tasked with adding CRCs to data files.
>
> CRC or checksum? If the objective is merely general "detection" there
> should be some latitude in choosing the methodology for performance.

See above. Perhaps the best win would be a case where you could
choose which method you wanted. We generally have extra CPU on the
servers, so we could afford to burn some cycles with more complex
algorithms.
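
To make the "choose which method you want" idea concrete, here is a minimal sketch in Python rather than PostgreSQL code. The names CHECK_METHODS and block_check are hypothetical, and the 8192-byte block size is assumed. zlib.crc32 and zlib.adler32 are the Python standard-library functions; Adler-32 stands in for a typically cheaper but weaker checksum.

```python
import zlib

BLOCK_SIZE = 8192  # assumed block size for this sketch

# Hypothetical registry of selectable check methods.
CHECK_METHODS = {
    "crc32": zlib.crc32,      # stronger error detection
    "adler32": zlib.adler32,  # simpler checksum, usually cheaper in software
}

def block_check(block: bytes, method: str = "crc32") -> int:
    """Return a 32-bit check value for one block using the chosen method."""
    if len(block) != BLOCK_SIZE:
        raise ValueError("unexpected block size")
    return CHECK_METHODS[method](block) & 0xFFFFFFFF

# A clean block and a copy with a single flipped bit.
block = bytes(BLOCK_SIZE)
flipped = bytearray(block)
flipped[100] ^= 0x01
flipped = bytes(flipped)
```

Both methods report a different value for the flipped copy; they differ in cost and in how well they catch larger or patterned errors, which is the trade-off a configurable method would expose.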

>> The idea is that these CRCs are going to be checked just after reading
>> files from disk, and calculated just before writing it. They are
>> just a protection against the storage layer going mad; they are not
>> intended to protect against faulty RAM, CPU or kernel.
>
> It will actually find faults in all of it. If the CPU can't add and/or a
> RAM location lost a bit, this will blow up just as easily as a bad block.
> It may cause "false identification" of an error, but it will keep a bad
> system from hiding.

Well, very likely not, since the intention is to only compute the CRC
when we write the block out, at least for now. In the future I would
like to be able to detect when a CPU or memory goes bonkers and poops
on something, because that's actually happened to us as well.
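
The difference between the two failure modes can be sketched as follows. This is an illustrative Python model, not PostgreSQL code: data_file, crc_fork, flush_block, read_block and flip_bit are hypothetical names, and the dictionaries merely stand in for on-disk storage.

```python
import zlib

data_file = {}  # hypothetical stand-in for the relation's data file
crc_fork = {}   # hypothetical stand-in for the proposed CRC fork

def flush_block(blkno: int, page: bytes) -> None:
    """Compute the CRC just before the write, as in the proposal."""
    crc_fork[blkno] = zlib.crc32(page)
    data_file[blkno] = bytes(page)

def read_block(blkno: int) -> bytes:
    """Verify the CRC just after the read; raise on mismatch."""
    page = data_file[blkno]
    if zlib.crc32(page) != crc_fork[blkno]:
        raise IOError(f"CRC mismatch in block {blkno}")
    return page

def flip_bit(page: bytes, offset: int) -> bytes:
    """Return a copy of the page with one bit flipped."""
    damaged = bytearray(page)
    damaged[offset] ^= 0x01
    return bytes(damaged)

good = b"x" * 8192

# Case 1: RAM or CPU damages the page before the flush.
# The CRC is computed over the already-damaged bytes, so it matches.
flush_block(0, flip_bit(good, 10))
memory_fault_detected = False
try:
    read_block(0)
except IOError:
    memory_fault_detected = True

# Case 2: the storage layer damages the page after a clean flush.
flush_block(1, good)
data_file[1] = flip_bit(good, 10)
storage_fault_detected = False
try:
    read_block(1)
except IOError:
    storage_fault_detected = True
```

In this model only the second case raises an error. Catching the first would need a check value computed earlier, while the page is still known to be good.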

>> The implementation I'm envisioning requires the use of a new relation
>> fork to store the per-block CRCs. Initially I'm aiming at a CRC32 sum
>> for each block. FlushBuffer would calculate the checksum and store it
>> in the CRC fork; ReadBuffer_common would read the page, calculate the
>> checksum, and compare it to the one stored in the CRC fork.
>
> Hell, all that is needed is a long or a short checksum value in the block.
> I mean, if you just want a sanity test, it doesn't take much. Using a
> second relation creates confusion. If there is a CRC discrepancy between
> two different blocks, who's wrong? You need a third "control" to know. If
> the block knows its CRC or checksum and that is in error, the block is bad.

I believe the idea was to make this as non-invasive as possible. And
it would be really nice if this could be enabled without a
dump/reload (maybe the upgrade stuff would make this possible?).
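
For comparison, the in-block alternative suggested in the quoted text might look roughly like the sketch below. It is Python, not PostgreSQL code, and the layout is an assumption made for illustration: a 4-byte check value at offset 0 covering the rest of an 8192-byte block. seal_page and page_is_valid are hypothetical names.

```python
import struct
import zlib

BLOCK_SIZE = 8192
CHECK_FMT = "<I"  # hypothetical: 4-byte little-endian check value at offset 0
CHECK_LEN = struct.calcsize(CHECK_FMT)

def seal_page(payload: bytes) -> bytes:
    """Build a block whose first 4 bytes are the CRC32 of the rest."""
    if len(payload) != BLOCK_SIZE - CHECK_LEN:
        raise ValueError("payload does not fill the block")
    return struct.pack(CHECK_FMT, zlib.crc32(payload)) + payload

def page_is_valid(block: bytes) -> bool:
    """Recompute the CRC32 of the payload and compare with the stored value."""
    (stored,) = struct.unpack_from(CHECK_FMT, block, 0)
    return stored == zlib.crc32(block[CHECK_LEN:])

block = seal_page(b"\x42" * (BLOCK_SIZE - CHECK_LEN))

# Damage in the payload and damage in the check field itself.
damaged_payload = bytearray(block)
damaged_payload[500] ^= 0x80
damaged_check = bytearray(block)
damaged_check[0] ^= 0x01
```

Either kind of damage marks the block as bad without consulting a second structure. The cost is that the check value takes space inside the page, which changes the on-disk layout and is presumably why a separate fork looks less invasive.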
--
Decibel!, aka Jim C. Nasby, Database Architect decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828
