Re: Undetected corruption of table files

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Albe Laurenz" <all(at)adv(dot)magwien(dot)gv(dot)at>, <pgsql-general(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Undetected corruption of table files
Date: 2007-08-27 15:50:06
Message-ID: 87lkbxf041.fsf@oxford.xeocode.com
Lists: pgsql-general pgsql-hackers

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> "Albe Laurenz" <all(at)adv(dot)magwien(dot)gv(dot)at> writes:
>> Tom Lane wrote:
>>>> Would it be an option to have a checksum somewhere in each
>>>> data block that is verified upon read?
>
>>> That's been proposed before and rejected before. See the archives ...
>
>> I searched for "checksum" and couldn't find it. Could someone
>> give me a pointer? I'm not talking about WAL files here.
>
> "CRC" maybe? Also, make sure your search goes all the way back; I think
> the prior discussions were around the same time WAL was initially put
> in, and/or when we dropped the WAL CRC width from 64 to 32 bits.
> The very measurable overhead of WAL CRCs are the main thing that's
> discouraged us from having page CRCs. (Well, that and the lack of
> evidence that they'd actually gain anything.)

I thought we determined the reason WAL CRCs are expensive is that we have
to checksum each WAL record individually. I recall the last time this came up
I ran some microbenchmarks and found that the cost to CRC an entire 8k block
was on the order of tens of microseconds.
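
A minimal sketch of that kind of microbenchmark, for anyone who wants to
repeat it. This is not the original benchmark: it assumes zlib's CRC-32 as a
stand-in for the WAL CRC code, and the names crc_block and time_crc are made up
for illustration. The numbers will depend on the machine and the CRC
implementation.

```python
import os
import timeit
import zlib

BLOCK_SIZE = 8192  # one 8k block, the size discussed above


def crc_block(block):
    """Return the CRC-32 of one block (zlib's CRC-32, an assumed stand-in)."""
    return zlib.crc32(block) & 0xFFFFFFFF


def time_crc(block, iterations=10000):
    """Return the mean time, in microseconds, to checksum the block once."""
    total_seconds = timeit.timeit(lambda: crc_block(block), number=iterations)
    return total_seconds / iterations * 1e6


if __name__ == "__main__":
    block = os.urandom(BLOCK_SIZE)  # random contents, so nothing is compressible or special
    print(f"CRC-32 of {BLOCK_SIZE} bytes: {time_crc(block):.2f} us per block")
```

Checksumming a whole block in one call amortizes the per-call overhead, which
is the contrast with checksumming many small WAL records one at a time.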

The last time it came up was in the context of allowing turning off
full_page_writes but offering a guarantee that torn pages would be detected on
recovery and no later. I was a proponent of using writev to embed bytes in
each 512 byte block and Jonah said it would be no faster than a CRC (and
obviously considerably more complicated). My benchmarks showed that Jonah was
right and the CRC was cheaper than the added cost of using writev.
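
For readers who have not seen the per-sector marker idea, here is a rough
sketch of how torn-page detection with embedded bytes could work. The layout is
hypothetical (one tag byte at the end of each 512 byte sector, 511 payload
bytes per sector), and the code only models the resulting on-disk image; it
does not call writev. All function names are invented for the example.

```python
SECTOR = 512                      # sector size discussed above
SECTORS_PER_PAGE = 16             # 16 * 512 = 8192 bytes on disk
PAYLOAD_PER_SECTOR = SECTOR - 1   # hypothetical: last byte of each sector is a tag


def build_image(payload, tag):
    """Interleave payload with one tag byte per sector (hypothetical layout)."""
    assert len(payload) == SECTORS_PER_PAGE * PAYLOAD_PER_SECTOR
    image = bytearray()
    for i in range(SECTORS_PER_PAGE):
        start = i * PAYLOAD_PER_SECTOR
        image += payload[start:start + PAYLOAD_PER_SECTOR]
        image.append(tag & 0xFF)
    return bytes(image)


def is_torn(image):
    """A page is torn if its sectors do not all carry the same tag."""
    tags = {image[i * SECTOR + SECTOR - 1] for i in range(SECTORS_PER_PAGE)}
    return len(tags) > 1


def tear(old_image, new_image, sectors_written):
    """Simulate a crash after only the first sectors of the new image hit disk."""
    cut = sectors_written * SECTOR
    return new_image[:cut] + old_image[cut:]


if __name__ == "__main__":
    size = SECTORS_PER_PAGE * PAYLOAD_PER_SECTOR
    old = build_image(bytes(size), tag=1)
    new = build_image(b"\xff" * size, tag=2)
    print(is_torn(old), is_torn(tear(old, new, 5)), is_torn(new))
```

A whole-page CRC gets the same detection with a single pass over the page and
no change to the data layout, which is part of why it is the simpler option.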

I do agree the benefits of having a CRC are overstated. Most of the time corruption
is caused by bad memory, and a CRC will happily checksum the corrupted memory
just fine. A checksum is no guarantee. But I've also seen data corruption
caused by bad memory in an i/o controller, for example. There are always going
to be cases where it could help.
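
To make the two cases concrete, here is a small illustration. It assumes
zlib's CRC-32 and a checksum stored alongside the page; seal, verify and
flip_bit are hypothetical helpers, not anything in the PostgreSQL source.

```python
import zlib

PAGE_SIZE = 8192


def seal(page):
    """Compute the checksum at write time; returns (page, crc)."""
    return page, zlib.crc32(page)


def verify(page, crc):
    """Recompute the checksum at read time and compare."""
    return zlib.crc32(page) == crc


def flip_bit(page, bit_index):
    """Return a copy of the page with one bit flipped, simulating corruption."""
    damaged = bytearray(page)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


if __name__ == "__main__":
    clean = bytes(PAGE_SIZE)

    # Case 1: bad memory corrupts the page BEFORE the checksum is computed.
    # The CRC is computed over the damaged bytes, so verification passes.
    page, crc = seal(flip_bit(clean, 100))
    print("corrupted before checksum, verifies:", verify(page, crc))

    # Case 2: the page is corrupted AFTER the checksum is computed, for
    # example on the way through an i/o controller. Verification fails.
    page, crc = seal(clean)
    print("corrupted after checksum, verifies:", verify(flip_bit(page, 100), crc))
```

The first case is the one a checksum cannot help with; the second is the kind
of case where it could.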

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
