Re: Online verification of checksums

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Online verification of checksums
Date: 2019-03-30 11:56:21
Message-ID: CABUevEy23u6-s_-q4m7Yz3QZwbAKFespAvqY4Gq7DvJGhFheVw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 29, 2019 at 10:08 PM Michael Banck <michael(dot)banck(at)credativ(dot)de>
wrote:

> Hi,
>
> Am Freitag, den 29.03.2019, 16:52 +0100 schrieb Magnus Hagander:
> > On Fri, Mar 29, 2019 at 4:30 PM Stephen Frost <sfrost(at)snowman(dot)net>
> wrote:
> > > * Magnus Hagander (magnus(at)hagander(dot)net) wrote:
> > > > On Thu, Mar 28, 2019 at 10:19 PM Tomas Vondra <
> tomas(dot)vondra(at)2ndquadrant(dot)com>
> > > > wrote:
> > > > > On Thu, Mar 28, 2019 at 01:11:40PM -0700, Andres Freund wrote:
> > > > > >On 2019-03-28 21:09:22 +0100, Michael Banck wrote:
> > > > > >> I agree that the current patch might have some corner-cases
> where it
> > > > > >> does not guarantee 100% accuracy in online mode, but I hope the
> current
> > > > > >> version at least has no more false negatives.
> > > > > >
> > > > > >False positives are *bad*. We shouldn't integrate code that has
> them.
> > > > >
> > > > > Yeah, I agree. I'm a bit puzzled by the reluctance to make the
> online mode
> > > > > communicate with the server, which would presumably address these
> issues.
> > > > > Can someone explain why not to do that?
> > > >
> > > > I agree that this effort seems better spent on fixing those issues
> there
> > > > (of which many are the same), and then re-use that.
> > >
> > > This really seems like it depends on which of the options we're talking
> > > about.. Connecting to the server and asking what the current insert
> > > point is, so we can check that the LSN isn't completely insane, seems
> > > reasonable, but at least one option being discussed was to have
> > > pg_basebackup actually *lock the page* (even if just for I/O..) and
> then
> > > re-read it, and having an external tool doing that instead of the
> > > backend seems like a whole different level to me. That would involve
> > > having an SQL function for "lock this page against I/O" and then
> another
> > > for "unlock this page", wouldn't it?
> >
> > Right.
> >
> > But what if we just added a flag to the BASE_BACKUP command in the
> > replication protocol that said "meh, I really just want to verify the
> > checksums, so please send the data to devnull and only feed me regular
> > status updates on this connection"?
>
> I don't know whether BASE_BACKUP is the best interface for that (at
> least right now) - backend/replication/basebackup.c's sendFile() gets
> only an absolute filename to send, which is not adequate for more in-
> depth server-based things like locking a particular page in a particular
> relation of some particular tablespace.

> ISTM that the fact that we had to teach it about different segment files
> for checksum verification by splitting up the filename at "." implies
> that it is not the correct level of abstraction (but maybe it could get
> schooled some more about Postgres internals, e.g. by passing it a
> RefFileNode struct and not a filename).
>

But that has to be fixed in pg_basebackup *regardless*, doesn't it? And if
we fix it there, we only have to fix it once...

//Magnus

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2019-03-30 12:16:21 Re: pgsql: Improve autovacuum logging for aggressive and anti-wraparound ru
Previous Message Peter Eisentraut 2019-03-30 11:27:47 Re: PostgreSQL pollutes the file system