Re: Checksums by default?

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Checksums by default?
Date: 2017-01-21 18:37:26
Message-ID: 20170121183726.GP18360@tamriel.snowman.net
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > Because I see having checksums as, frankly, something we always should
> > have had (as most other databases do, for good reason...) and because
> > they will hopefully prevent data loss. I'm willing to give us a fair
> > bit to minimize the risk of losing data.
>
> To be perfectly blunt, that's just magical thinking. Checksums don't
> prevent data loss in any way, shape, or form. In fact, they can *cause*
> data loss, or at least make it harder for you to retrieve your data,
> in the event of bugs causing false-positive checksum failures.

This is not a new argument, at least to me, and I don't agree with it.

> What checksums can do for you, perhaps, is notify you in a reasonably
> timely fashion if you've already lost data due to storage-subsystem
> problems. But in a pretty high percentage of cases, that fact would
> be extremely obvious anyway, because of visible data corruption.

Exactly, and that awareness allows a user to prevent further data
loss or corruption. Slow corruption over time is a well-known,
real-world problem that people do experience, and bit flips are common
enough that someone wrote a not-that-old blog post about them:

https://blogs.oracle.com/ksplice/entry/attack_of_the_cosmic_rays1

A really nice property of checksums on pages is that they also tell you
what data you *didn't* lose, which can be extremely valuable.

> I think the only really clear benefit accruing from checksums is that
> they make it easier to distinguish storage-subsystem failures from
> Postgres bugs. That can certainly be a benefit to some users, but
> I remain dubious that the average user will find it worth any noticeable
> amount of overhead.

Or from memory errors, kernel bugs, or virtualization bugs, if they
happen at the right time. We keep adding layers between the DB and the
storage, and to think they're all perfect is certainly a step farther
than I'd go.

Thanks!

Stephen
