Re: Online verification of checksums

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Online verification of checksums
Date: 2019-03-18 06:38:10
Message-ID: 20190318063810.GI6197@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

* Michael Paquier (michael(at)paquier(dot)xyz) wrote:
> On Mon, Mar 18, 2019 at 01:43:08AM -0400, Stephen Frost wrote:
> > To be clear, I agree completely that we don't want to be reporting false
> > positives or "this might mean corruption!" to users running the tool,
> > but I haven't seen a good explaination of why this needs to involve the
> > server to avoid that happening. If someone would like to point that out
> > to me, I'd be happy to go read about it and try to understand.
>
> The mentions on this thread that the server has all the facility in
> place to properly lock a buffer and make sure that a partial read
> *never* happens and that we *never* have any kind of false positives,

Uh, we are, of course, going to have partial reads- we just need to
handle them appropriately, and that's not hard to do in a way that we
never have false positives.

I do not understand, at all, the whole sub-thread argument that we have
to avoid partial reads. We certainly don't worry about that when doing
backups, and I don't see why we need to avoid it here. We are going to
have partial reads- and that's ok, as long as it's because we're at the
end of the file, and that's easy enough to check by just doing another
read to see if we get back zero bytes, which indicates we're at the end
of the file, and then we move on, no need to coordinate anything with
the backend for this.

> directly preventing the set of issues we are trying to implement
> workarounds for in a frontend tool are rather good arguments in my
> opinion (you can grep for BufferDescriptorGetIOLock() on this thread
> for example).

Sure the backend has those facilities since it needs to, but these
frontend tools *don't* need that to *never* have any false positives, so
why are we complicating things by saying that this frontend tool and the
backend have to coordinate?

If there's an explanation of why we can't avoid having false positives
in the frontend tool, I've yet to see it. I definitely understand that
we can get partial reads, but a partial read isn't a failure, and
shouldn't be reported as such.

Thanks!

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mithun Cy 2019-03-18 07:04:18 Re: BUG #15641: Autoprewarm worker fails to start on Windows with huge pages in use Old PostgreSQL community/pgsql-bugs x
Previous Message Stephen Frost 2019-03-18 06:32:20 Re: Data-only pg_rewind, take 2