Re: Online enabling of checksums

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Sergei Kornilov <sk(at)zsrv(dot)org>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: Online enabling of checksums
Date: 2018-09-29 15:58:39
Message-ID: 9193af87-77af-9c29-ac15-03364a2eec1b@2ndquadrant.com
Lists: pgsql-hackers

On 09/29/2018 02:19 PM, Stephen Frost wrote:
> Greetings,
>
> * Tomas Vondra (tomas(dot)vondra(at)2ndquadrant(dot)com) wrote:
>> While looking at the online checksum verification patch (which I guess
>> will get committed before this one), it occurred to me that disabling
>> checksums may need to be more elaborate, to protect against someone
>> using the stale flag value (instead of simply switching to "off"
>> assuming that's fine).
>>
>> The signals etc. seem good enough for our internal stuff, but what if
>> someone uses the flag in a different way? E.g. the online checksum
>> verification runs as an independent process (i.e. not a backend) and
>> reads the control file to find out if the checksums are enabled or not.
>> So if we just switch from "on" to "off" that will break.
>>
>> Of course, we may also say "Don't disable checksums while online
>> verification is running!" but that's not ideal.
>
> I'm not really sure what else we could say here..? I don't particularly
> see an issue with telling people that if they disable checksums while
> they're running a tool that's checking the checksums that they're going
> to get odd results.
>

I don't know, to be honest. I was merely looking at the online
verification patch and realized that if someone disables checksums while
it's running, the tool won't notice (because it only reads the flag
once, at the very beginning) and will likely produce bogus errors.

Although, maybe it won't - it now uses a checkpoint LSN, so that might
fix it. The checkpoint LSN is read from the same controlfile as the
flag, so we know the checksums were enabled during that checkpoint. So
if we ignore failures on pages with an LSN newer than that checkpoint,
that should do the trick, no?

So perhaps that's the right "protocol" to handle this?
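
To make the idea concrete, here is a minimal sketch of that rule. It's
Python only for brevity, and ControlSnapshot, Page and verify_pages are
made-up names for illustration, not anything from either patch. It
assumes the verifier reads the flag and the checkpoint LSN once, from
the same control file read, and that a page whose LSN equals the
checkpoint LSN is still reported (that boundary is my assumption).

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ControlSnapshot:
    """Hypothetical: what the verifier reads once from the control file."""
    checksums_enabled: bool
    checkpoint_lsn: int


@dataclass(frozen=True)
class Page:
    """Hypothetical, simplified page: block number, page LSN, checksum result."""
    block: int
    lsn: int
    checksum_ok: bool


def verify_pages(ctl: ControlSnapshot, pages: Iterable[Page]) -> List[int]:
    """Return the block numbers whose checksum failures can be trusted."""
    if not ctl.checksums_enabled:
        # Checksums were already off at the checkpoint: nothing to verify.
        return []

    failures = []
    for page in pages:
        if page.checksum_ok:
            continue
        if page.lsn > ctl.checkpoint_lsn:
            # The page was modified after the checkpoint we read. The flag
            # may have been switched off since then, so ignore this failure.
            continue
        # Page LSN is not newer than the checkpoint, when checksums were
        # known to be enabled, so the failure is reported.
        failures.append(page.block)
    return failures


if __name__ == "__main__":
    ctl = ControlSnapshot(checksums_enabled=True, checkpoint_lsn=1000)
    pages = [
        Page(block=0, lsn=900, checksum_ok=True),
        Page(block=1, lsn=950, checksum_ok=False),   # reported
        Page(block=2, lsn=1200, checksum_ok=False),  # newer LSN, ignored
    ]
    print(verify_pages(ctl, pages))
```

With this rule the verifier never has to re-read the flag: any page that
could have been written after checksums were disabled necessarily has an
LSN newer than the checkpoint it started from, and gets skipped.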

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
