Online enabling of page level checksums

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Online enabling of page level checksums
Date: 2017-01-22 11:13:38
Message-ID: CABUevEx8KWhZE_XkZQpzEkZypZmBp3GbM9W90JLp=-7OJWBbcg@mail.gmail.com
Lists: pgsql-hackers

So, that post I made about checksums certainly spurred a lot of discussion
:) One summary is that life would be a lot easier if we could turn
checksums on (and off) without re-initdbing. I'm breaking out this
question into this thread to talk about it separately.

I've been toying with a branch to work on this, but haven't had the time to
get it even to a compiling state. But instead of waiting until I have some
code to show, let me outline the idea I had.

My general idea is this:

Take one bit in the checksum version field and make it mean "in progress".
That means that checksums can now be "on", "off", or "in progress".

When checksums are "in progress", PostgreSQL will compute and write
checksums whenever it writes out a buffer, but it will *not* verify
checksums on read.
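
A minimal sketch of those three states, in Python just to keep it short. The
bit position, the constant names and the helper functions are all made up for
illustration; nothing here is existing PostgreSQL code.

```python
from enum import Enum

# Hypothetical encoding: the low bits carry the checksum version and one
# spare bit (picked arbitrarily here) means "in progress".
CHECKSUM_VERSION = 1
IN_PROGRESS_BIT = 0x8000


class ChecksumState(Enum):
    OFF = "off"
    IN_PROGRESS = "in progress"
    ON = "on"


def decode_state(version_field: int) -> ChecksumState:
    """Map the raw checksum version field to one of the three states."""
    if version_field & IN_PROGRESS_BIT:
        return ChecksumState.IN_PROGRESS
    if version_field & ~IN_PROGRESS_BIT:
        return ChecksumState.ON
    return ChecksumState.OFF


def compute_on_write(state: ChecksumState) -> bool:
    """Checksums are computed and written when "on" or "in progress"."""
    return state in (ChecksumState.ON, ChecksumState.IN_PROGRESS)


def verify_on_read(state: ChecksumState) -> bool:
    """Checksums are verified on read only when fully "on"."""
    return state is ChecksumState.ON
```

The point is only that writing and verifying become two separate decisions,
with "in progress" being the one state where they differ.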

This state would be set by calling a function (or an external command with
the system shut down if need be - I can accept a restart for this, but I'd
rather avoid it if possible).

This function would also launch a background worker. This worker would
enumerate the entire database block by block: read a block and verify whether
its checksum is set and correct. If it is, ignore the block (because any
further updates will keep it in state ok while we're in state "in progress").
If not, mark it as dirty and write it out through regular means, which will
include computing and writing the checksum since we're "in progress".
Something similar to vacuum cost delay would control how quickly it writes.
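
Roughly, the worker loop could look like the sketch below. It is a toy model
under loud assumptions: pages are plain Python objects, CRC-32 stands in for
the real page checksum algorithm, and cost_limit / cost_delay are hypothetical
knobs that only imitate the idea of vacuum cost delay.

```python
import time
import zlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    data: bytes
    checksum: Optional[int] = None  # None models "checksum never set"


def page_checksum(page: Page) -> int:
    # Stand-in only: CRC-32, not PostgreSQL's actual page checksum.
    return zlib.crc32(page.data)


def checksum_worker(pages: List[Page],
                    cost_limit: int = 200,
                    cost_delay: float = 0.0) -> int:
    """Visit every page once; return how many pages had to be rewritten."""
    rewritten = 0
    cost = 0
    for page in pages:
        if page.checksum == page_checksum(page):
            continue  # already set and correct, nothing to do
        # Models "mark dirty and write out": the regular write path
        # computes the checksum because the state is "in progress".
        page.checksum = page_checksum(page)
        rewritten += 1
        cost += 1
        if cost >= cost_limit:
            time.sleep(cost_delay)  # throttle the write rate
            cost = 0
    return rewritten
```

For example, running it over three pages where only one already has a correct
checksum rewrites the other two and leaves all three valid.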

Yes, this means the entire db will end up in the transaction log since
everything is rewritten. That's not great, but for a lot of people that
will be a trade they're willing to make since it's a one-time thing. Yes,
this background process might take days or weeks - that's OK as long as it
happens online.

Once the background worker is done, it flips the checksum state to "on",
and the system starts verifying checksums as well.

If the system is interrupted before the background worker is done, it
starts over from the beginning. Previously touched blocks will be read and
verified, but not written (because their checksum is already correct). This
will take time, but not re-generate the WAL.
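
To make the restart cost concrete, here is a tiny model of an interrupted run.
Each block is just a flag saying whether its checksum is already valid, and
run_pass and stop_after are hypothetical names used only for this
illustration.

```python
from typing import List, Optional, Tuple


def run_pass(has_checksum: List[bool],
             stop_after: Optional[int] = None) -> Tuple[int, int]:
    """Scan from block 0; return (blocks read, blocks written).

    stop_after simulates an interruption after that many blocks were read.
    """
    reads = writes = 0
    for i in range(len(has_checksum)):
        if stop_after is not None and reads >= stop_after:
            break  # simulated crash or shutdown
        reads += 1
        if not has_checksum[i]:
            has_checksum[i] = True  # rewritten, so this block generates WAL
            writes += 1
    return reads, writes
```

With five blocks and an interruption after three, the first run reads 3 and
writes 3. The restarted run reads all 5 again but writes only the remaining 2,
so the first three blocks cost read time but no new WAL.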

I think the actual functions and background worker could go in an extension
that's installed and loaded only by those who need it. But the core
functionality of being able to have "checksum in progress" would have to be
in the core codebase.

So, is there something obviously missing in this plan? Or just the code to
do it :)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
