Re: Add checksums without --initdb

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: David Christensen <david(at)endpoint(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add checksums without --initdb
Date: 2015-07-02 19:53:40
Message-ID: 559596C4.50103@iki.fi
Lists: pgsql-hackers

On 07/02/2015 10:39 PM, David Christensen wrote:
> Possible concerns here are whether checksums are included in WAL
> full_page_writes or if they are independently calculated; if the
> latter I think we’d be fine. If checksums are all handled at the
> layer below WAL then any streamed/processed changes should be fine to
> get us to the point where we could come up as a master.

It's not full_page_writes that's the problem; it's that the server does
not WAL-log hint bit updates unless you also have wal_log_hints enabled.
That would be simple to check for, though, and wal_log_hints can be
enabled with just a server restart, so that's not too onerous.

> Andres suggested a separate tool that would basically rewrite the
> existing data directory heap files in place, which I can also see a
> use case for, but I also think there’s some benefit to be found in
> having it happen while the replica is being streamed/built.
>
> Ideas/thoughts/reasons this wouldn’t work?

You probably could make this work, but it seems like a pretty
complicated way to enable checksums. There are also interesting
corner cases with replication: is it possible to connect a streaming
replica that's been restored from the checksums-enabled backup to a
checksums-disabled master? The enable-in-place approach seems a lot more
straightforward to me. In a nutshell:

Add an "enabling-checksums" mode to the server where it calculates
checksums for anything it writes, but doesn't check or complain about
incorrect checksums on reads. Put the server into that mode, and then
have a background process that reads through all data in the cluster,
calculates the checksum for every page, and writes all the data back.
Once that's completed, checksums can be fully enabled.
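
To make the idea concrete, here is a rough toy model of those three
states, not actual server code. Everything in it is an assumption made
for illustration: the mode names, the Cluster and Page classes, the
in-memory dict standing in for data files, and zlib.crc32 standing in
for the real page checksum algorithm. It also ignores WAL, concurrency,
and restarting after a crash partway through the rewrite.

```python
import zlib
from dataclasses import dataclass, field

# Hypothetical mode names, chosen for this sketch only.
OFF, ENABLING, ON = "off", "enabling", "on"


class ChecksumError(Exception):
    """Raised when a page fails verification in fully-enabled mode."""


def page_checksum(payload: bytes) -> int:
    # Stand-in checksum; the server's real page checksum is a different algorithm.
    return zlib.crc32(payload)


@dataclass
class Page:
    payload: bytes
    checksum: int = 0  # pages written with checksums off carry no valid checksum


@dataclass
class Cluster:
    mode: str = OFF
    pages: dict = field(default_factory=dict)  # page number -> Page

    def write(self, pageno: int, payload: bytes) -> None:
        # Checksums are calculated on every write in both ENABLING and ON.
        cksum = page_checksum(payload) if self.mode != OFF else 0
        self.pages[pageno] = Page(payload, cksum)

    def read(self, pageno: int) -> bytes:
        page = self.pages[pageno]
        # Only the fully-enabled mode checks and complains on reads.
        if self.mode == ON and page.checksum != page_checksum(page.payload):
            raise ChecksumError(f"bad checksum on page {pageno}")
        return page.payload


def enable_checksums(cluster: Cluster) -> None:
    """Background pass: rewrite every page, then switch to full checking."""
    cluster.mode = ENABLING
    for pageno in list(cluster.pages):
        # Reading is tolerant in ENABLING mode; writing back stamps a checksum.
        cluster.write(pageno, cluster.read(pageno))
    cluster.mode = ON
```

The point of the intermediate mode is that pages written by ordinary
activity during the background pass already carry correct checksums, so
once the pass has touched every page, nothing without a valid checksum
can be left behind when verification is switched on.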

- Heikki
