Re: Checksums by default?

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Checksums by default?
Date: 2017-01-22 09:17:16
Message-ID: 2842f1c9-bd7d-0eec-a24a-3d795c2cead2@2ndquadrant.com
Lists: pgsql-hackers

On 01/21/2017 05:35 PM, Tom Lane wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
>> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
>>> Have we seen *even one* report of checksums catching problems in
>>> a useful way?
>
>> This isn't the right question.
>
> I disagree. If they aren't doing something useful for people who
> have turned them on, what's the reason to think they'd do something
> useful for the rest?
>

I believe Stephen is right. The fact that you don't see something, e.g.
reports about checksums catching something in production deployments,
proves nothing because of "survivorship bias" discovered by Abraham Wald
during WW II [1]. Not seeing bombers with bullet holes in engines does
not mean you don't need to armor engines. Quite the opposite.

[1]
https://medium.com/@penguinpress/an-excerpt-from-how-not-to-be-wrong-by-jordan-ellenberg-664e708cfc3d#.j9d9c35mb

Applied to checksums, we're quite unlikely to see reports about data
corruption caught by checksums because "ERROR: invalid page in block X"
is such a clear sign of data corruption that people don't even ask us
about that. Combine that with the fact that most people are running with
defaults (i.e. no checksums) and that data corruption is a rare event by
nature, and we're bound to have no such reports.

What we do get, however, are reports about strange errors from instances
without checksums enabled, which were either determined to be data
corruption or disappeared after dump/restore or reindexing. It's hard
to say for sure whether those were cases of data corruption (where
checksums might have helped) or some other bug (one that produces a
corrupted page whose checksum is computed over the already-corrupted
contents, so checksums would not have helped).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
