From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Enabling Checksums
Date: 2012-11-12 17:52:25
Message-ID: 1352742745.3113.85.camel@jdavis-laptop

On Sun, 2012-11-11 at 23:55 -0500, Greg Smith wrote:
> Adding an initdb option to start out with everything checksummed seems
> an uncontroversial good first thing to have available.

OK, so here's my proposal for a first patch (changes from Simon's
patch); a rough sketch in C follows the list:

* Add a flag to the postgres executable indicating that it should use
checksums on everything. This would only be valid if bootstrap mode is
also specified.
* Add a multi-state checksums flag in pg_control that would have
three states: OFF, ENABLING, and ON. It would only be set to ON during
bootstrap, and in this first patch, it would not be possible to set
ENABLING.
* Remove the GUC and use this checksums flag everywhere.
* Use the TLI field rather than the version field of the page header.
* Incorporate page number into checksum calculation (already done).
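
To make the pieces concrete, here's a minimal standalone sketch. To be
clear, this is not the patch: the enum name, the constants, and the
FNV-style hash are placeholders I'm using for illustration. The only
parts taken from the proposal are the three states, storing the state
in pg_control, folding the block number into the checksum, and fitting
the result into the 16-bit TLI slot of the page header.

#include <stdint.h>
#include <stddef.h>

#define BLCKSZ 8192

/* Tri-state flag stored in pg_control (names are illustrative). */
typedef enum ChecksumState
{
    CHECKSUMS_OFF,       /* no checksums anywhere */
    CHECKSUMS_ENABLING,  /* reserved for future online conversion */
    CHECKSUMS_ON         /* set only at bootstrap in this first patch */
} ChecksumState;

/*
 * Compute a 16-bit page checksum, folding in the block number so that
 * a page written to the wrong location is also caught.  FNV-1a over
 * the page bytes is a stand-in for whatever algorithm the patch ends
 * up using; the caller is expected to zero the on-page checksum field
 * before calling.
 */
uint16_t
page_checksum(const char *page, uint32_t blkno)
{
    uint32_t hash = 2166136261u;    /* FNV-1a offset basis */
    size_t   i;

    /* mix the block number in before the page contents */
    hash = (hash ^ blkno) * 16777619u;

    for (i = 0; i < BLCKSZ; i++)
        hash = (hash ^ (uint8_t) page[i]) * 16777619u;

    /* fold to 16 bits so it fits the reused TLI field */
    return (uint16_t) (hash ^ (hash >> 16));
}

The tri-state flag is the part that matters for later work: ENABLING
is there so an online conversion could eventually move a cluster
through OFF -> ENABLING -> ON without a new pg_control format.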

Does this satisfy the requirements for a first step? Does it interfere
with potential future work?

> Won't a pg_checksums program just grow until it looks like a limited
> version of vacuum though?

We can dig into the details of that later, but I don't think it's
useless, even if we do have per-table (or better) checksums. For
instance, it would be useful to verify backups offline.
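
As a sketch of what that offline verification could look like (again
hypothetical: it reuses the page_checksum() sketched above and assumes
the checksum sits at a fixed offset in the page header, i.e. the
reused TLI slot):

#include <stdio.h>
#include <string.h>
#include <stdint.h>

#define BLCKSZ 8192
#define CHECKSUM_OFFSET 8   /* assumed offset of the reused TLI field */

extern uint16_t page_checksum(const char *page, uint32_t blkno);

/*
 * Walk a relation file from a backup, block by block, and recompute
 * each page's checksum.  Returns the number of bad blocks, or -1 if
 * the file can't be opened.
 */
int
verify_relation_file(const char *path)
{
    FILE    *f = fopen(path, "rb");
    char     page[BLCKSZ];
    uint32_t blkno = 0;
    int      bad = 0;

    if (f == NULL)
        return -1;

    while (fread(page, 1, BLCKSZ, f) == BLCKSZ)
    {
        uint16_t stored;

        memcpy(&stored, page + CHECKSUM_OFFSET, sizeof(stored));

        /* zero the stored value before recomputing */
        memset(page + CHECKSUM_OFFSET, 0, sizeof(stored));

        if (page_checksum(page, blkno) != stored)
        {
            fprintf(stderr, "%s: checksum mismatch on block %u\n",
                    path, blkno);
            bad++;
        }
        blkno++;
    }

    fclose(f);
    return bad;
}

Nothing here needs a running server, which is the point: you can check
a cold backup before you have to rely on it.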

I think it's a legitimate concern that we might reinvent some VACUUM
machinery. Ideally, we'd get better online migration tools for checksums
(perhaps using VACUUM) fast enough that nobody will bother introducing
that kind of bloat into pg_checksums.

> I think it's useful to step back for a minute and consider the larger
> uncertainty an existing relation has, which amplifies just how ugly this
> situation is. The best guarantee I think online checksumming can offer
> is to tell the user "after transaction id X, all new data in relation R
> is known to be checksummed".

It's slightly better than that. It's more like: "we can tell you if any
of your data gets corrupted after transaction X". If old data is
corrupted before transaction X, then there's nothing we can do. But if
it's corrupted after transaction X (even if it's old data), the
checksums should catch it.

> Unless you do this at initdb time, any
> conversion case is going to have the possibility that a page is
> corrupted before you get to it--whether you're adding the checksum as
> part of a "let's add them while we're writing anyway" page update or the
> conversion tool is hitting it.

Good point.

> That's why I don't think anyone will find online conversion really
> useful until they've done a full sweep updating the old pages.

I don't entirely agree. A lot of times, you just want to know whether
your disk is changing your data out from under you. Maybe you miss some
cases and maybe not all of your data is protected, but just knowing
which disks need to be replaced, and which RAID controllers not to buy
again, is quite valuable. And the more data you get checksummed, the
faster you'll find out.

> One of the really common cases I was expecting here is that conversions
> are done by kicking off a slow background VACUUM CHECKSUM job that might
> run in pieces.

Right now I'm focused on the initial patch and other fairly immediate
goals, so I won't address this now. But I don't want to cut off the
conversation, either.

Regards,
Jeff Davis
