Re: Online verification of checksums

From:	Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To:	Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc:	David Steele <david(at)pgmasters(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject:	Re: Online verification of checksums
Date:	2018-09-26 15:14:02
Message-ID:	alpine.DEB.2.21.1809261703520.22248@lancre
Lists:	pgsql-hackers

>> The patch is missing a documentation update.
>
> I've added that now. I think the only change needed was removing the
> "server needs to be offline" part?

Yes, and also checking that the described behavior corresponds to the new
version.

>> There are debatable changes of behavior:
>>
>> if (errno == ENOENT) return / continue...
>>
>> For instance, a file disappearing is ok online, but not so if offline. On
>> the other hand, the probability that a file suddenly disappears while the
>> server offline looks remote, so reporting such issues does not seem
>> useful.
>>
>> However I'm more wary with other continues/skips added. ISTM that skipping
>> a block because of a read error, or because it is new, or some other
>> reasons, is not the same thing, so should be counted & reported
>> differently?
>
> I think that would complicate things further without a lot of benefit.
>
> After all, we are interested in checksum failures, not necessarily read
> failures etc. so exiting on them (and skip checking possibly large parts
> of PGDATA) looks undesirable to me.

Hmmm.

I'm really saying that it is debatable, so here is some fuel for the
debate:

If I run the check command and it cannot do its job, that is a problem
which is as bad as a failing checksum. The only safe assumption about a
block that cannot be read is that its checksum is bad... So ISTM that for
some of the "skipped" errors there should be an appropriate report (exit
code, final output) that something is amiss.
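
To make the point concrete, here is a minimal sketch in C of counting
skips per reason and letting unreadable blocks affect the exit status.
All names (SkipReason, ScanStats, record_skip, exit_code_for) and the
exit code values are hypothetical and not taken from the patch; whether
a read error deserves its own exit code or the same one as a checksum
failure is exactly the debatable part.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical skip categories; the patch does not define these. */
typedef enum
{
    SKIP_FILE_VANISHED,     /* open failed with ENOENT: expected online */
    SKIP_NEW_BLOCK,         /* new (all-zero) page: nothing to verify */
    SKIP_READ_ERROR,        /* read failed: checksum state is unknown */
    SKIP_REASON_COUNT
} SkipReason;

typedef struct
{
    long        blocks_checked;
    long        bad_checksums;
    long        skipped[SKIP_REASON_COUNT];
} ScanStats;

/* Map an errno value to a skip category (assumed policy). */
SkipReason
classify_open_error(int err)
{
    return (err == ENOENT) ? SKIP_FILE_VANISHED : SKIP_READ_ERROR;
}

/* Count one skipped file or block under the given reason. */
void
record_skip(ScanStats *stats, SkipReason reason)
{
    assert((int) reason >= 0 && reason < SKIP_REASON_COUNT);
    stats->skipped[reason]++;
}

/*
 * Assumed exit policy: 1 for checksum failures, 2 when something could
 * not be read, 0 otherwise. Vanished files and new blocks stay benign.
 */
int
exit_code_for(const ScanStats *stats)
{
    if (stats->bad_checksums > 0)
        return 1;
    if (stats->skipped[SKIP_READ_ERROR] > 0)
        return 2;
    return 0;
}

/* Final output: report every category separately. */
void
report(const ScanStats *stats, FILE *out)
{
    fprintf(out, "Blocks checked: %ld\n", stats->blocks_checked);
    fprintf(out, "Bad checksums: %ld\n", stats->bad_checksums);
    fprintf(out, "Skipped (file vanished): %ld\n",
            stats->skipped[SKIP_FILE_VANISHED]);
    fprintf(out, "Skipped (new block): %ld\n",
            stats->skipped[SKIP_NEW_BLOCK]);
    fprintf(out, "Skipped (read error): %ld\n",
            stats->skipped[SKIP_READ_ERROR]);
}
```

With something like this, the scan loop can still "continue" on a read
error, so large parts of PGDATA are not left unchecked, but the final
output and the exit code show that the run was incomplete.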

--
Fabien.

In response to

Re: Online verification of checksums at 2018-09-26 14:37:18 from Michael Banck
