Re: Online verification of checksums

From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Online verification of checksums
Date: 2019-03-08 11:51:21
Message-ID: 1552045881.4947.43.camel@credativ.de
Lists: pgsql-hackers

Hi,

On Sunday, 2019-03-03 at 11:51 +0100, Michael Banck wrote:
> On Saturday, 2019-03-02 at 11:08 -0500, Stephen Frost wrote:
> > I'm not necessarily against skipping to the next file, to be clear,
> > but I think I'd be happier if we kept reading the file until we
> > actually get EOF.
>
> So if we read half a block twice we should seek() to the next block and
> continue till EOF, ok. I think in most cases those pages will be new
> anyway and there will be no checksum check, but it sounds like a cleaner
> approach. I've seen one or two examples where we did successfully verify
> the checksum of a page after a half-read, so it might be worth it.

I've done that now, i.e. it seeks to the next block and continues to
read there (possibly getting an EOF).

I don't issue a warning for this skipped block anymore as it is somewhat
to be expected that we see some half-reads. If the seek fails for some
reason, that is still a warning.
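
For illustration only (this is not the code from the attached patch), the
read loop described above could be sketched roughly as follows in Python.
The names scan_blocks and verify_block are hypothetical, the checksum check
is a caller-supplied placeholder, and the 8192-byte block size is simply
PostgreSQL's default:

```python
import io

BLCKSZ = 8192  # PostgreSQL's default block size (assumed here)


def scan_blocks(f, verify_block, blcksz=BLCKSZ):
    """Read f block by block until EOF.

    verify_block(blkno, buf) is a hypothetical callback that returns
    True if the block's checksum is fine. Returns a tuple
    (checked, skipped, warnings).
    """
    checked = skipped = 0
    warnings = []
    blkno = 0
    retried = False
    while True:
        buf = f.read(blcksz)
        if not buf:
            break  # real EOF, we are done with this file
        if len(buf) < blcksz:
            # Half-read: re-read the same block once; if it is still
            # short, seek to the next block and carry on until EOF.
            target = blkno + 1 if retried else blkno
            try:
                f.seek(target * blcksz)
            except OSError as exc:
                # A failing seek is still worth a warning.
                warnings.append(f"seek to block {target} failed: {exc}")
                break
            if retried:
                skipped += 1  # skipped silently, no warning
                blkno += 1
            retried = not retried
            continue
        retried = False
        if not verify_block(blkno, buf):
            warnings.append(f"checksum mismatch in block {blkno}")
        checked += 1
        blkno += 1
    return checked, skipped, warnings


# Two full blocks followed by half a block, as a stand-in for a relation
# file that is being extended while we read it.
data = bytes(2 * BLCKSZ + BLCKSZ // 2)
print(scan_blocks(io.BytesIO(data), lambda blkno, buf: True))
```

In this toy run the trailing half block is read twice, skipped without a
warning, and the following read hits EOF.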

> I still think that an external checksum verification tool has some
> merit, given that basebackup does it and the current offline requirement
> is really not useful in practice.

I've read the rest of the thread, and it seems several people prefer a
solution that interacts with the server. I won't be able to work on that
for v12 and I guess it would be too late in the cycle anyway.

I thought about I/O throttling in online mode, but it seems easiest to
tie it in with the progress reporting (which already keeps track of
everything, or most of what we'd need), so I will work on it in that
context.

Michael

--
Michael Banck
Project Manager / Senior Consultant
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Management: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Our handling of personal data is subject to the
following provisions: https://www.credativ.de/datenschutz

Attachment                                  Content-Type  Size
online-verification-of-checksums_V13.patch  text/x-patch  10.1 KB
