
Re: Block-level CRC checks

From: Greg Stark <greg(dot)stark(at)enterprisedb(dot)com>
To: Decibel! <decibel(at)decibel(dot)org>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Block-level CRC checks
Date: 2008-09-30 22:49:17
Message-ID: DC9A3FF3-F056-449A-926F-CD1F57740042@enterprisedb.com
Lists: pgsql-hackers

On 30 Sep 2008, at 10:17 PM, Decibel! <decibel(at)decibel(dot)org> wrote:

> On Sep 30, 2008, at 1:48 PM, Heikki Linnakangas wrote:
>> This has been suggested before, and the usual objection is  
>> precisely that it only protects from errors in the storage layer,  
>> giving a false sense of security.
>
> If you can come up with a mechanism for detecting non-storage errors  
> as well, I'm all ears. :)
>
> In the meantime, you're way, way more likely to experience  
> corruption at the storage layer than anywhere else.

Fwiw this hasn't been my experience. Bad memory is extremely common  
and even the storage failures I've seen (excluding the drive crashes)  
turned out to actually be caused by bad memory.

That said, I've always been interested in doing this. The main use case
in my mind has actually been data that's been restored from old
backups which have been lying around and floating between machines for
a while, with many opportunities for bit errors to show up.


The main stumbling block I ran into was how to deal with turning the  
option off and on. I wanted it to be possible to turn off the option  
to have the database ignore any errors and to avoid the overhead.

But that means including an escape hatch value which is always
considered to be correct, and that dramatically reduces the
effectiveness of the scheme.
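
A minimal sketch of why an escape-hatch value weakens the check. Everything here is an assumption made for illustration, not anything from a PostgreSQL patch: the 4-byte CRC-32 stored at the start of the page, the choice of 0 as the escape-hatch value, and the names `NO_CHECKSUM`, `write_page` and `page_is_valid` are all hypothetical.

```python
import zlib

PAGE_SIZE = 8192
CRC_BYTES = 4
NO_CHECKSUM = 0  # hypothetical escape-hatch value: "written with checks off"


def page_crc(payload: bytes) -> int:
    """CRC-32 of the payload, remapped so it never equals NO_CHECKSUM."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return 1 if crc == NO_CHECKSUM else crc


def write_page(payload: bytes, checks_enabled: bool) -> bytes:
    """Prefix the payload with its CRC, or with the escape-hatch value."""
    stored = page_crc(payload) if checks_enabled else NO_CHECKSUM
    return stored.to_bytes(CRC_BYTES, "little") + payload


def page_is_valid(page: bytes) -> bool:
    """Accept a page if its CRC matches or it carries the escape-hatch value."""
    stored = int.from_bytes(page[:CRC_BYTES], "little")
    if stored == NO_CHECKSUM:
        return True  # always considered correct, whatever the payload holds
    return stored == page_crc(page[CRC_BYTES:])


# Build one full page of arbitrary payload with the check enabled.
payload = bytes(range(256)) * 31 + bytes(PAGE_SIZE - CRC_BYTES - 256 * 31)
good = write_page(payload, checks_enabled=True)

# A single flipped bit in the payload is caught.
flipped = bytearray(good)
flipped[100] ^= 0x01

# A block that was zeroed out is accepted, because its zeroed CRC field
# is indistinguishable from the escape-hatch value.
zeroed = bytes(PAGE_SIZE)

print(page_is_valid(good), page_is_valid(bytes(flipped)), page_is_valid(zeroed))
```

With this layout, any corruption that happens to leave the escape-hatch value in the CRC field, a zeroed block being the obvious case, passes verification even while the feature is turned on.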

Another issue is that it will make the space available on each page
smaller, making it harder to do in-place upgrades.
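
The space problem can be sketched with a little arithmetic. The numbers are assumptions chosen for illustration (an 8192-byte page, a 24-byte header before the change, a 4-byte checksum field), and `fits_after_upgrade` is a hypothetical helper, not an existing function.

```python
PAGE_SIZE = 8192
OLD_HEADER = 24   # assumed header size before the change
CRC_BYTES = 4     # assumed size of the new checksum field


def usable_space(header_bytes: int) -> int:
    """Bytes left for data once the header is accounted for."""
    return PAGE_SIZE - header_bytes


def fits_after_upgrade(used_bytes: int) -> bool:
    """True if a page's existing contents still fit once the header grows."""
    return used_bytes <= usable_space(OLD_HEADER + CRC_BYTES)


# A page that was completely full under the old layout no longer fits.
full_page = usable_space(OLD_HEADER)
print(full_page, fits_after_upgrade(full_page))
```

Under these assumptions, any page using more than 8164 bytes cannot simply be rewritten in place with the new header; some of its contents would have to move to another page first.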


If you can deal with those issues and carefully handle the
contingencies, so it's clear to people what to do when errors occur or
when they want to turn the feature on or off, then I'm all for it. That's
despite my experience of memory errors being a lot more common than
undetected storage errors.
