Re: Online verification of checksums

From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Online verification of checksums
Date: 2019-03-18 07:18:18
Message-ID: 1552893498.9697.30.camel@credativ.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Am Montag, den 18.03.2019, 02:38 -0400 schrieb Stephen Frost:
> * Michael Paquier (michael(at)paquier(dot)xyz) wrote:
> > On Mon, Mar 18, 2019 at 01:43:08AM -0400, Stephen Frost wrote:
> > > To be clear, I agree completely that we don't want to be reporting false
> > > positives or "this might mean corruption!" to users running the tool,
> > > but I haven't seen a good explaination of why this needs to involve the
> > > server to avoid that happening. If someone would like to point that out
> > > to me, I'd be happy to go read about it and try to understand.
> >
> > The mentions on this thread that the server has all the facility in
> > place to properly lock a buffer and make sure that a partial read
> > *never* happens and that we *never* have any kind of false positives,
>
> Uh, we are, of course, going to have partial reads- we just need to
> handle them appropriately, and that's not hard to do in a way that we
> never have false positives.

I think the current patch (V13 from https://www.postgresql.org/message-i
d/1552045881(dot)4947(dot)43(dot)camel(at)credativ(dot)de) does that, modulo possible bugs.

> I do not understand, at all, the whole sub-thread argument that we have
> to avoid partial reads. We certainly don't worry about that when doing
> backups, and I don't see why we need to avoid it here. We are going to
> have partial reads- and that's ok, as long as it's because we're at the
> end of the file, and that's easy enough to check by just doing another
> read to see if we get back zero bytes, which indicates we're at the end
> of the file, and then we move on, no need to coordinate anything with
> the backend for this.

Well, I agree with you, but we don't seem to have consensus on that.

> > directly preventing the set of issues we are trying to implement
> > workarounds for in a frontend tool are rather good arguments in my
> > opinion (you can grep for BufferDescriptorGetIOLock() on this thread
> > for example).
>
> Sure the backend has those facilities since it needs to, but these
> frontend tools *don't* need that to *never* have any false positives, so
> why are we complicating things by saying that this frontend tool and the
> backend have to coordinate?
>
> If there's an explanation of why we can't avoid having false positives
> in the frontend tool, I've yet to see it. I definitely understand that
> we can get partial reads, but a partial read isn't a failure, and
> shouldn't be reported as such.

It is not in the current patch, it should just get reported as a skipped
block in the end. If the cluster is online that is, if it is offline,
we do consider it a failure.

I have now rebased that patch on top of the pg_verify_checksums ->
pg_checksums renaming, see attached.

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

Attachment Content-Type Size
online-verification-of-checksums_V14.patch text/x-patch 10.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Matsumura, Ryo 2019-03-18 07:31:51 RE: SQL statement PREPARE does not work in ECPG
Previous Message Mithun Cy 2019-03-18 07:04:18 Re: BUG #15641: Autoprewarm worker fails to start on Windows with huge pages in use Old PostgreSQL community/pgsql-bugs x