Re: Online verification of checksums

From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: Magnus Hagander <magnus(at)hagander(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Online verification of checksums
Date: 2019-03-29 21:08:30
Message-ID: 1553893710.4884.62.camel@credativ.de
Lists: pgsql-hackers

Hi,

On Friday, 29.03.2019 at 16:52 +0100, Magnus Hagander wrote:
> On Fri, Mar 29, 2019 at 4:30 PM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > * Magnus Hagander (magnus(at)hagander(dot)net) wrote:
> > > On Thu, Mar 28, 2019 at 10:19 PM Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
> > > wrote:
> > > > On Thu, Mar 28, 2019 at 01:11:40PM -0700, Andres Freund wrote:
> > > > >On 2019-03-28 21:09:22 +0100, Michael Banck wrote:
> > > > >> I agree that the current patch might have some corner-cases where it
> > > > >> does not guarantee 100% accuracy in online mode, but I hope the current
> > > > >> version at least has no more false negatives.
> > > > >
> > > > >False positives are *bad*. We shouldn't integrate code that has them.
> > > >
> > > > Yeah, I agree. I'm a bit puzzled by the reluctance to make the online mode
> > > > communicate with the server, which would presumably address these issues.
> > > > Can someone explain why not to do that?
> > >
> > > I agree that this effort seems better spent on fixing those issues there
> > > (of which many are the same), and then re-use that.
> >
> > This really seems like it depends on which of the options we're talking
> > about..   Connecting to the server and asking what the current insert
> > point is, so we can check that the LSN isn't completely insane, seems
> > reasonable, but at least one option being discussed was to have
> > pg_basebackup actually *lock the page* (even if just for I/O..) and then
> > re-read it, and having an external tool doing that instead of the
> > backend seems like a whole different level to me.  That would involve
> > having an SQL function for "lock this page against I/O" and then another
> > for "unlock this page", wouldn't it?
>
> Right.
>
> But what if we just added a flag to the BASE_BACKUP command in the
> replication protocol that said "meh, I really just want to verify the
> checksums, so please send the data to devnull and only feed me regular
> status updates on this connection"?

I don't know whether BASE_BACKUP is the best interface for that (at
least right now) - backend/replication/basebackup.c's sendFile() gets
only an absolute filename to send, which is not adequate for more in-
depth server-based things like locking a particular page in a particular
relation of some particular tablespace.

ISTM that the fact that we had to teach it about different segment files
for checksum verification by splitting up the filename at "." implies
that it is not the correct level of abstraction (but maybe it could get
schooled some more about Postgres internals, e.g. by passing it a
RelFileNode struct and not a filename).
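
To illustrate the kind of filename splitting meant above, here is a
minimal standalone sketch. It is not the code in basebackup.c: the
SegmentRef struct and parse_segment_path() are hypothetical names made
up for this example, and the struct is only a stand-in for what a real
RelFileNode-based interface would carry. It assumes file names of the
form "<digits>[_<fork>][.<segment>]" and does not check that the fork
suffix is one Postgres actually uses.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for illustration; NOT PostgreSQL's RelFileNode. */
typedef struct SegmentRef
{
    unsigned long relnode;  /* numeric part of the file name, e.g. 16384 */
    unsigned long segno;    /* segment number, 0 if there is no ".N" suffix */
    char          fork[8];  /* "" for the main fork, else e.g. "fsm" or "vm" */
} SegmentRef;

/*
 * Split the last component of "path" into relnode, fork and segment.
 * Returns false if the name does not look like a relation segment file.
 */
static bool
parse_segment_path(const char *path, SegmentRef *out)
{
    const char *name;
    char       *end;

    assert(path != NULL && out != NULL);

    /* Only the last path component matters. */
    name = strrchr(path, '/');
    name = (name != NULL) ? name + 1 : path;

    /* The name has to start with the numeric relfilenode. */
    if (*name < '0' || *name > '9')
        return false;
    errno = 0;
    out->relnode = strtoul(name, &end, 10);
    if (errno != 0)
        return false;

    out->fork[0] = '\0';
    out->segno = 0;

    /* Optional "_fork" suffix; the fork name itself is not validated. */
    if (*end == '_')
    {
        size_t      len = strcspn(end + 1, ".");

        if (len == 0 || len >= sizeof(out->fork))
            return false;
        memcpy(out->fork, end + 1, len);
        out->fork[len] = '\0';
        end += 1 + len;
    }

    /* Optional ".N" segment suffix: the split at "." discussed above. */
    if (*end == '.')
    {
        const char *seg = end + 1;

        if (*seg < '0' || *seg > '9')
            return false;
        errno = 0;
        out->segno = strtoul(seg, &end, 10);
        if (errno != 0)
            return false;
    }

    /* Anything left over means this is not a segment file name. */
    return *end == '\0';
}
```

With a made-up path such as "/data/base/13/16384.2" this fills in
relnode 16384 and segno 2, while a name like "pg_control" is rejected.
All of that has to be reverse-engineered from a string, which is the
abstraction problem; a caller that already knows the relation would not
need any of it.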

Michael

--
Michael Banck
Project Manager / Senior Consultant
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
VAT ID number: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Management: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Our handling of personal data is subject to the
following provisions: https://www.credativ.de/datenschutz
