Re: [DESIGN] Incremental checksums

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: David Christensen <david(at)endpoint(dot)com>
Cc: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: [DESIGN] Incremental checksums
Date: 2015-07-15 07:18:40
Message-ID: CAA4eK1+bVuact-dhUv1igSXe2E03QV+xw-j_ReO6FLssjd0bvQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 14, 2015 at 1:56 AM, David Christensen <david(at)endpoint(dot)com>
wrote:

>
> For any relation that it finds in the database which is not checksummed,
> it starts an actual worker to handle the checksum process for this table.
> Since the state of the cluster is already either "enforcing" or
> "revalidating", any block writes will get checksums added automatically, so
> the only thing the bgworker needs to do is load each block in the relation
> and explicitly mark as dirty (unless that's not required for FlushBuffer()
> to do its thing). After every block in the relation is visited this way
> and checksummed, its pg_class record will have "rellastchecksum" updated.
>

If the system crashes partway through the scan of a relation, say
after half of its blocks have been checksummed, then under the above
scheme a restart would have to read all the blocks again, even though
some of them were already checksummed in the previous pass. That is
okay if it happens for a few small or medium-sized relations, but
suppose several large relations are all in that state (half of their
blocks checksummed) when the crash occurs. Then it could lead to much
more I/O than required.
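
To put rough numbers on this, here is a small Python model (not
PostgreSQL code). It assumes a single crash that hits every relation
after the same fraction of its blocks has been visited. The
checkpoint_every parameter is purely hypothetical: it stands for some
per-relation progress marker, which the proposal as quoted does not
have, and is there only for comparison.

```python
def blocks_read(relation_sizes, crash_fraction, checkpoint_every=None):
    """Total block reads needed to checksum every relation, given one crash
    after crash_fraction of each relation has been processed.

    checkpoint_every is a HYPOTHETICAL knob: if set, progress is assumed to
    be recorded every that many blocks, so a restart resumes from the last
    recorded position instead of from block 0.
    """
    total = 0
    for nblocks in relation_sizes:
        done_before_crash = int(nblocks * crash_fraction)
        if checkpoint_every is None:
            # Only completion of the whole relation is recorded
            # (as with "rellastchecksum"), so the restart begins at block 0.
            resume_from = 0
        else:
            # Resume from the last recorded multiple of checkpoint_every.
            resume_from = (done_before_crash // checkpoint_every) * checkpoint_every
        # Reads before the crash plus reads after the restart.
        total += done_before_crash + (nblocks - resume_from)
    return total


sizes = [1_000_000, 2_000_000]  # relation sizes in blocks (made-up numbers)
restart_from_zero = blocks_read(sizes, 0.5)
with_progress = blocks_read(sizes, 0.5, checkpoint_every=1000)
print(restart_from_zero, with_progress)
```

With these made-up sizes the model gives 4,500,000 block reads when
only completion is recorded, against 3,000,000 if progress were
recorded every 1000 blocks.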

> ** Function API:
>
> Interface to the functionality will be via the following Utility
> functions:
>
> - pg_enable_checksums(void) => turn checksums on for a cluster. Will
> error if the state is anything but "disabled". If this is the first time
> this cluster has run this, this will initialize
> ControlFile->data_checksum_version to the preferred built-in algorithm
> (since there's only one currently, we just set it to 1). This increments
> the ControlFile->data_checksum_cycle variable, then sets the state to
> "enabling", which means that the next time the bgworker checks if there is
> anything to do it will see that state, scan all the databases'
> "datlastchecksum" fields, and start kicking off the bgworker processes to
> handle the checksumming of the actual relation files.
>
> - pg_disable_checksums(void) => turn checksums off for a cluster. Sets
> the state to "disabled", which means bg_worker will not do anything.
>
> - pg_request_checksum_cycle(void) => if checksums are "enabled",
> increment the data_checksum_cycle counter and set the state to "enabling".
>

If checksums are already enabled for the cluster, then what is
the need for any further action?
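
For reference, this is my reading of the three functions as quoted,
written as a toy Python model. The class name and the use of 0 for
"version never initialized" are my assumptions. The step that moves
the state from "enabling" to "enabled" is not described in the quoted
text, so it is left out, and treating pg_request_checksum_cycle() as a
no-op in any other state is also a guess.

```python
class ChecksumControl:
    """Toy stand-in for the checksum fields of the control file.

    Field names follow the quoted proposal; everything else is illustrative.
    """

    def __init__(self):
        self.state = "disabled"
        self.data_checksum_version = 0  # assumption: 0 means "never initialized"
        self.data_checksum_cycle = 0

    def pg_enable_checksums(self):
        # Errors out unless the cluster is currently "disabled".
        if self.state != "disabled":
            raise RuntimeError("checksums can only be enabled from 'disabled'")
        if self.data_checksum_version == 0:
            # First time: pick the only built-in algorithm, version 1.
            self.data_checksum_version = 1
        self.data_checksum_cycle += 1
        self.state = "enabling"

    def pg_disable_checksums(self):
        # The bgworker does nothing while the state is "disabled".
        self.state = "disabled"

    def pg_request_checksum_cycle(self):
        # Only acts when checksums are "enabled"; otherwise assumed a no-op.
        if self.state == "enabled":
            self.data_checksum_cycle += 1
            self.state = "enabling"


ctl = ChecksumControl()
ctl.pg_enable_checksums()
ctl.state = "enabled"            # stands in for the bgworker finishing its pass
ctl.pg_request_checksum_cycle()  # starts a second full pass over the cluster
print(ctl.state, ctl.data_checksum_cycle)
```

In this model the only effect of pg_request_checksum_cycle() on an
already enabled cluster is to bump the cycle counter and send the
bgworker over every relation again, which is the action I am asking
about.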

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
