Online enabling of checksums

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>
Subject: Online enabling of checksums
Date: 2018-02-21 20:53:31
Message-ID: CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Lists: pgsql-hackers

Once more, here is an attempt to solve the problem of on-line enabling of
checksums that Daniel and I have been hacking on for a bit. See for
example
https://www.postgresql.org/message-id/CABUevEx8KWhZE_XkZQpzEkZypZmBp3GbM9W90JLp%3D-7OJWBbcg%40mail.gmail.com
and
https://www.postgresql.org/message-id/flat/FF393672-5608-46D6-9224-6620EC532693%40endpoint.com#FF393672-5608-46D6-9224-6620EC532693@endpoint.com
for some previous discussions.

Base design:

Change the checksum flag from an on/off boolean to an enum:
off/inprogress/on. When checksums are off or on, they work like today.
When checksums are in progress, checksums are *written* but not verified.
State can go from “off” to “inprogress”, from “inprogress” to either “on”
or “off”, or from “on” to “off”.

Two new functions are added, pg_enable_data_checksums() and
pg_disable_data_checksums(). The disable one is easy: it just changes the
state to “off”. The enable one changes the state to “inprogress” and then
starts a background worker (the “checksumhelper launcher”). The launcher
in turn starts one sub-worker (“checksumhelper worker”) in each database
(currently all done sequentially). Each worker enumerates all
tables/indexes/etc in its database and validates their checksums. If there
is no checksum, or the checksum is incorrect, it computes a new checksum
and writes it out. When all databases have been processed, the checksum
state changes to “on” and the launcher shuts down. At this point, the
cluster has checksums enabled as if it had been initdb’d with checksums
turned on.

If the cluster shuts down while “inprogress”, the DBA will have to
manually either restart the worker (by calling pg_enable_data_checksums())
or turn checksums off again. Checksums “in progress” only carry a cost and
no benefit.

The change of the checksum state is WAL-logged with a new xlog record.
Full page writes are forced for all buffers written by the background
worker, to make sure the checksum is fully updated on the standby even if
no actual contents of the buffer changed.

We’ve also included a small command-line tool, bin/pg_verify_checksums,
that can be run against an offline cluster to validate all checksums.
Future improvements include being able to use the background
worker/launcher to perform an online check as well. Being able to run more
parallel workers in the checksumhelper might also be of interest.

The patch includes two sets of tests. The first is an isolation test that
turns on checksums while one session is writing to the cluster and another
is continuously reading, to simulate turning on checksums in a production
database. The second is a TAP test that enables checksums with streaming
replication turned on, to test the new xlog record. The isolation test ran
into the 1024 character limit of the isolation test lexer; there is a
separate patch and discussion at
https://www.postgresql.org/message-id/8D628BE4-6606-4FF6-A3FF-8B2B0E9B43D0@yesql.se
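
To make the state rules in the base design concrete, here is a minimal
sketch of the off/inprogress/on transitions and of which states write or
verify checksums. It is an illustration only, written in Python rather than
taken from the patch; the names ChecksumState, ALLOWED, can_transition,
writes_checksums and verifies_checksums are hypothetical and do not appear
in the patch.

```python
from enum import Enum


class ChecksumState(Enum):
    """The three states described in the base design (names are illustrative)."""
    OFF = "off"
    INPROGRESS = "inprogress"
    ON = "on"


# Allowed state changes as listed in the mail:
# off -> inprogress, inprogress -> on or off, on -> off.
# Re-running the enable function while already "inprogress" keeps the
# state unchanged, so it is not modelled as a transition here.
ALLOWED = {
    ChecksumState.OFF: {ChecksumState.INPROGRESS},
    ChecksumState.INPROGRESS: {ChecksumState.ON, ChecksumState.OFF},
    ChecksumState.ON: {ChecksumState.OFF},
}


def can_transition(old, new):
    """Return True if the design permits changing from old to new."""
    return new in ALLOWED[old]


def writes_checksums(state):
    """Checksums are written while in progress and when fully on."""
    return state in (ChecksumState.INPROGRESS, ChecksumState.ON)


def verifies_checksums(state):
    """Checksums are only verified once the state is on."""
    return state is ChecksumState.ON
```

Under this model there is no direct path from “off” to “on”: a cluster can
only reach “on” by way of “inprogress”, that is, after the worker has
processed every database.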

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

Attachment: online_checksums.patch (text/x-patch, 68.1 KB)
