Re: Block-level CRC checks

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Block-level CRC checks
Date: 2009-12-01 15:35:13
Message-ID: 1259681713.13774.13747.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 2009-12-01 at 16:40 +0200, Heikki Linnakangas wrote:

> It's not hard to imagine that when a hardware glitch happens
> causing corruption, it also causes the system to crash. Recalculating
> the CRCs after crash would mask the corruption.

They are already masked from us, so continuing to mask those errors
would not put us in a worse position.

If we are saying that 99% of page corruptions are caused at crash time
because of torn pages on hint bits, then only WAL logging can help us
find the 1%. I'm not convinced that is an accurate or safe assumption
and I'd at least like to see LOG entries showing what happened.

ISTM we could go for two levels of protection. CRC checks and scanner
for Level 1 protection, then full WAL logging for Level 2 protection.

--
Simon Riggs www.2ndQuadrant.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2009-12-01 15:55:54 Re: Block-level CRC checks
Previous Message Robert Haas 2009-12-01 15:22:03 Re: [HACKERS] Fwd: psql+krb5