Re: Online verification of checksums

From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Online verification of checksums
Date: 2019-03-01 00:05:14
Message-ID: 1551398714.4947.28.camel@credativ.de
Lists: pgsql-hackers

Hi,

On Thursday, 28.02.2019 at 14:29 +0100, Fabien COELHO wrote:
> > So I have now changed behaviour so that short writes count as skipped
> > files and pg_verify_checksums no longer bails out on them. When this
> > occurs a warning is written to stderr and their overall count is also
> > reported at the end. However, unless there are other blocks with bad
> > checksums, the exit status is kept at zero.
>
> This seems fair when online, however I'm wondering whether it is when
> offline. I'd say that the whole retry logic should be skipped in this
> case? i.e. "if (block_retry || !online) { error message and continue }"
> on both short read & checksum failure retries.

Ok, the stand-alone pg_checksums program also got a PR pointing out that
the LSN skip logic is not helpful when the instance is offline and
somebody just writes /dev/urandom over the heap files:

https://github.com/credativ/pg_checksums/pull/6

So I now tried to change the patch so that it only retries blocks when
online.
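
A minimal sketch of that rule, not the patch itself: the enum, the
function name and its parameters are all hypothetical, and the real code
in the attached patch may be structured differently. It only illustrates
that a failed block (short read or checksum mismatch) gets one retry when
online and none when offline.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of examining one block; not taken from the patch. */
typedef enum
{
    BLOCK_OK,       /* block read fully and checksum matched */
    BLOCK_RETRY,    /* re-read the block once before judging it */
    BLOCK_FAILED    /* report the problem and continue with the next block */
} BlockAction;

/*
 * Decide what to do with a block.  "block_ok" is false for a short read
 * or a checksum mismatch.  Retrying only makes sense when the cluster is
 * online, because only then can a concurrent write explain the failure.
 */
static BlockAction
decide_block_action(bool block_ok, int retries_done, bool online)
{
    assert(retries_done >= 0);

    if (block_ok)
        return BLOCK_OK;
    if (retries_done > 0 || !online)
        return BLOCK_FAILED;
    return BLOCK_RETRY;
}
```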

> Patch applies cleanly, compiles, global & local make check ok.
>
> I'm wondering whether it should exit(1) on "lseek" failures. Would it make
> sense to skip the file and report it as such? Should it be counted as a
> skippedfile?

Ok, I think it makes sense to march on and I changed it that way.
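
Roughly, and only as an illustration: the helper below is hypothetical
(name, signature and message text are assumptions, not what the patch
does verbatim). It shows an lseek() failure being turned into a warning
plus a skipped-file count instead of exit(1).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Try to seek to "offset" in "fd".  On failure, warn on stderr, count the
 * file as skipped and return false so the caller can move on to the next
 * file rather than terminating the whole run.
 */
static bool
seek_or_skip(int fd, off_t offset, const char *path, long *skippedfiles)
{
    assert(skippedfiles != NULL);

    if (lseek(fd, offset, SEEK_SET) < 0)
    {
        fprintf(stderr, "warning: could not seek in file \"%s\": %s, skipping\n",
                path, strerror(errno));
        (*skippedfiles)++;
        return false;
    }
    return true;
}
```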

> WRT the final status, ISTM that skippedblocks & files could warrant an
> error when offline, although they might be ok when online?

Ok, also changed it that way.
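
As a sketch of the resulting exit status rule (the function and its
parameter names are hypothetical, chosen only for illustration): bad
checksums always give a non-zero status, while skipped blocks or files
do so only when offline.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Compute the process exit status from the final counters.
 * Online, skipped blocks/files may be caused by concurrent activity and
 * are tolerated; offline there is no such excuse, so they are an error.
 */
static int
final_exit_status(long badblocks, long skippedblocks, long skippedfiles,
                  bool online)
{
    assert(badblocks >= 0 && skippedblocks >= 0 && skippedfiles >= 0);

    if (badblocks > 0)
        return 1;
    if (!online && (skippedblocks > 0 || skippedfiles > 0))
        return 1;
    return 0;
}
```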

New patch attached.

Michael

--
Michael Banck
Project Manager / Senior Consultant
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
VAT ID number: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Management: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Our handling of personal data is subject to the
following provisions: https://www.credativ.de/datenschutz

Attachment Content-Type Size
online-verification-of-checksums_V11.patch text/x-patch 9.9 KB
