Re: Online verification of checksums

From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Asif Rehman <asifr(dot)rehman(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Online verification of checksums
Date: 2020-04-06 21:15:17
Message-ID: dfeee5471ea96cf30f720177b12ec9053a954598.camel@credativ.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Am Montag, den 06.04.2020, 16:45 -0400 schrieb Tom Lane:
> I wrote:
> > Another thing that's bothering me is that the patch compares page LSN
> > against GetInsertRecPtr(); but that function says
> > ...
> > I'm not convinced that an approximation is good enough here. It seems
> > like a page that's just now been updated could have an LSN beyond the
> > current XLOG page start, potentially leading to a false checksum
> > complaint. Maybe we could address that by adding one xlog page to
> > the GetInsertRecPtr result? Kind of a hack, but ...

I was about to write that it sounds like a pragmatic solution to me,
but...

> Actually, after thinking about that a bit more: why is there an LSN-based
> special condition at all? It seems like it'd be far more useful to
> checksum everything, and on failure try to re-read and re-verify the page
> once or twice, so as to handle the corner case where we examine a page
> that's in process of being overwritten.

Andres outlined something about a year ago which on re-reading sounds
similar to what you suggest above in
20190326170820(dot)6sylklg7eh6uhabd(at)alap3(dot)anarazel(dot)de but never posted a
full patch. He seems to have had a few additional checks from PageIsVerified() in mind, though.

The original check against the checkpoint LSN wasn't suggested by me;
I've submitted this patch with the InsertRecPtr as an upper bound as a
*(presumably) minimal-invasive patch which could be back-patched (when
nothing came of the above thread for a while), but the issue seems to be
quite a bit nuanced.

Probably we need to take a step back; the question is whether something
like what Andres suggested should/could be coded up for v13 still
(before the feature freeze) and if so, by whom (I won't have the time),
or whether it would still qualify as a back-patchable bug-fix and/or
whether your suggestion above would.

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message James Coleman 2020-04-06 21:20:53 Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Previous Message Andres Freund 2020-04-06 21:14:01 Re: SyncRepLock acquired exclusively in default configuration