Re: 16-bit page checksums for 9.2

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, david(at)fetter(dot)org, aidan(at)highrise(dot)ca, stark(at)mit(dot)edu, pgsql-hackers(at)postgresql(dot)org
Subject: Re: 16-bit page checksums for 9.2
Date: 2012-02-20 23:23:42
Message-ID: CA+U5nM+AJO+ShwaoCY9O-XRWbVQxb4ebDd7+n8nTHK2t5e5FHw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 20, 2012 at 11:09 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> On Sun, Feb 19, 2012 at 05:04:06PM -0500, Robert Haas wrote:
>> Another disadvantage of the current scheme is that there's no
>> particularly easy way to know that your whole cluster has checksums.
>> No matter how we implement checksums, you'll have to rewrite every
>> table in the cluster in order to get them fully turned on.  But with
>> the current design, there's no easy way to know how much of the
>> cluster is actually checksummed.  If you shut checksums off, they'll
>> linger until those pages are rewritten, and there's no easy way to
>> find the relations from which they need to be removed, either.
>
> Yes, pg_upgrade makes this hard.  If you are using pg_dump to restore,
> and set the checksum GUC before you do the restore, and never turn it
> off, then you will have a fully checksum'ed database.
>
> If you use pg_upgrade, and enable the checksum GUC, your database will
> become progressively checksummed, and as Simon pointed out, the only
> clean way is VACUUM FULL.  It is quite hard to estimate the checksum
> coverage of a database with mixed checksumming --- one cool idea would
> be for ANALYZE to report how many of the pages it saw were checksummed.
> Yeah, crazy, but it might be enough.

Well, I didn't say VACUUM FULL was the only clean way of knowing
whether every block is checksummed, its a very intrusive way.

If you want a fast upgrade with pg_upgrade, rewriting every block is
not really a grand plan, but if you want it...

If we did that, I think I would prefer to do it with these commands

VACUUM ENABLE CHECKSUM; //whole database only
VACUUM DISABLE CHECKSUM;

rather than use a GUC. We can then add an option to pg_upgrade.

That way, we scan whole database, adding checksums and then record it
in pg_database

When we remove it, we do same thing in reverse then record it.

So there's no worries about turning on/off GUCs and we know for
certain where our towel is.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Maxim Boguk 2012-02-21 00:10:09 Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
Previous Message Peter Geoghegan 2012-02-20 23:16:04 Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)