Re: Enabling Checksums

From: Florian Pflug <fgp(at)phlo(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Markus Wanner <markus(at)bluegap(dot)ch>, Jesper Krogh <jesper(at)krogh(dot)cc>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Enabling Checksums
Date: 2012-11-10 13:46:44
Message-ID: 6363139D-2757-4A78-993D-66F39F478D80@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Nov10, 2012, at 00:08 , Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Fri, 2012-11-09 at 20:48 +0100, Markus Wanner wrote:
>> Given your description of option 2 I was under the impression that each
>> page already has a bit indicating whether or not the page is protected
>> by a checksum. Why do you need more bits than that?
>
> The bit indicating that a checksum is present may be lost due to
> corruption.

Though that concern mostly goes away if instead of a separate bit we use a
special checksum value, say 0xDEAD, to indicate that the page isn't
checksummed, no?

If checksums were always enabled, the probability of a random corruption
going undetected is N/N^2 = 1/N where N is the number of distinct checksum
values, since out of the N^2 equally likely pairs of computed and stored
checksums values, N show two identical values.

With the 0xDEAD-scheme, the probability of a random corruption going
undetected is (N-1 + N)/N^2 = 2/N - 1/N^2, since there are (N-1) pairs
with identical values != 0xDEAD, and N pairs where the stored checksum
value is 0xDEAD.

So instead of a 1 in 65536 chance of a corruption going undetected, the
0xDEAD-schema gives (approximately) a chance of 1 in 32768, i.e the
strength of the checksum is reduced by one bit. That's still acceptable,
I'd say.

In practice, 0xDEAD may be a bad choice because of it's widespread use
as an uninitialized marker for blocks of memory. A randomly picked value
would probably be a better choice.

best regards,
Florian Pflug

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2012-11-10 15:54:55 Re: Incorrect behaviour when using a GiST index on points
Previous Message Chen Huajun 2012-11-10 09:46:24 Re: Fix errcontext() function