Re: [GENERAL] Undetected corruption of table files

From: Decibel! <decibel(at)decibel(dot)org>
To: Albe Laurenz <all(at)adv(dot)magwien(dot)gv(dot)at>
Cc: Jan Wieck *EXTERN* <JanWieck(at)Yahoo(dot)com>, Tom Lane *EXTERN* <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [GENERAL] Undetected corruption of table files
Date: 2007-08-31 15:49:08
Message-ID: 20070831154908.GW38801@decibel.org
Lists: pgsql-general pgsql-hackers

On Fri, Aug 31, 2007 at 02:34:09PM +0200, Albe Laurenz wrote:
> I have thought some more about it, and tend to agree now:
> Checksums will only detect disk failure, and that's only
> one of the many integrity problems that can happen.
> And one that can be reduced to a reasonable degree with good
> storage systems.
>
> So the benefit of checksums is not enough to bother.

Uhm... how often do we get people asking about corruption on -admin
alone? 2-3x a month? ISTM it would be very valuable to be able to tell
those folks whether the corruption occurred between writing a page out
and reading it back in.
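
A minimal sketch of that write-out/read-back check, in Python rather than
the backend's C. The 8192-byte page matches PostgreSQL's default block
size; the 4-byte trailer layout and the names seal_page and verify_page
are hypothetical, made up for illustration and not anything in the
source tree:

```python
import zlib

PAGE_SIZE = 8192   # PostgreSQL's default block size
CRC_SIZE = 4       # hypothetical layout: CRC-32 kept in the last 4 bytes


def seal_page(payload: bytes) -> bytes:
    """Build the on-disk page: payload followed by its CRC-32.

    This stands in for the step where the page is handed off to be written.
    """
    if len(payload) != PAGE_SIZE - CRC_SIZE:
        raise ValueError("payload must fill the page minus the CRC trailer")
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(CRC_SIZE, "little")


def verify_page(page: bytes) -> bool:
    """Recompute the CRC when the page is read back and compare it."""
    if len(page) != PAGE_SIZE:
        return False
    payload, stored = page[:-CRC_SIZE], page[-CRC_SIZE:]
    return zlib.crc32(payload) == int.from_bytes(stored, "little")
```

A mismatch on read-back says the damage happened somewhere between the
write and the read (disk, controller, filesystem), which is exactly the
question we can't answer for people today. A match says nothing about
damage done before the CRC was computed.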

Even if we don't care about folks running on suspect hardware, having a
CRC would make it far more reasonable to recommend full_page_writes=off.
I never turn it off, and I recommend that folks leave it on, because
there's no way to know whether turning it off will corrupt, or already
has corrupted, data.

BTW, a method that would buy additional protection would be to compute
the CRC for a page every time you modify it in a way that generates a
WAL record, and record that CRC with the WAL record. That would protect
against corruption that happens any time after the page was modified,
instead of only after smgr went to write it out. How useful that is I
don't know...
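
To make the idea concrete, here is a toy model in Python, not
PostgreSQL's WAL format. WalRecord, apply_change and page_matches_wal
are hypothetical names, and the record carries only what the sketch
needs:

```python
import zlib
from dataclasses import dataclass

PAGE_SIZE = 8192  # PostgreSQL's default block size


@dataclass
class WalRecord:
    offset: int    # where in the page the change was applied
    data: bytes    # the bytes that were written
    page_crc: int  # CRC-32 of the whole page right after the change


def apply_change(page: bytearray, wal: list, offset: int, data: bytes) -> WalRecord:
    """Modify the page in memory and log the change with the new page CRC."""
    if offset < 0 or offset + len(data) > len(page):
        raise ValueError("change does not fit in the page")
    page[offset:offset + len(data)] = data
    record = WalRecord(offset, bytes(data), zlib.crc32(bytes(page)))
    wal.append(record)
    return record


def page_matches_wal(page: bytes, wal: list) -> bool:
    """Check a page against the CRC stored in its most recent WAL record."""
    return bool(wal) and zlib.crc32(page) == wal[-1].page_crc
```

Because the CRC is taken at modification time, a page that gets
scribbled on while it sits in shared memory would fail the check too,
not just one damaged on its way to or from disk. The sketch assumes
every change to the page goes through apply_change; any change that is
not logged would show up as a false mismatch.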
--
Decibel!, aka Jim Nasby decibel(at)decibel(dot)org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
