From: | Michael Banck <michael(dot)banck(at)credativ(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Asif Rehman <asifr(dot)rehman(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: Online verification of checksums |
Date: | 2020-04-06 21:15:17 |
Message-ID: | dfeee5471ea96cf30f720177b12ec9053a954598.camel@credativ.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Am Montag, den 06.04.2020, 16:45 -0400 schrieb Tom Lane:
> I wrote:
> > Another thing that's bothering me is that the patch compares page LSN
> > against GetInsertRecPtr(); but that function says
> > ...
> > I'm not convinced that an approximation is good enough here. It seems
> > like a page that's just now been updated could have an LSN beyond the
> > current XLOG page start, potentially leading to a false checksum
> > complaint. Maybe we could address that by adding one xlog page to
> > the GetInsertRecPtr result? Kind of a hack, but ...
I was about to write that it sounds like a pragmatic solution to me,
but...
> Actually, after thinking about that a bit more: why is there an LSN-based
> special condition at all? It seems like it'd be far more useful to
> checksum everything, and on failure try to re-read and re-verify the page
> once or twice, so as to handle the corner case where we examine a page
> that's in process of being overwritten.
Andres outlined something about a year ago which on re-reading sounds
similar to what you suggest above in
20190326170820(dot)6sylklg7eh6uhabd(at)alap3(dot)anarazel(dot)de but never posted a
full patch. He seems to have had a few additional checks from PageIsVerified() in mind, though.
The original check against the checkpoint LSN wasn't suggested by me;
I've submitted this patch with the InsertRecPtr as an upper bound as a
*(presumably) minimal-invasive patch which could be back-patched (when
nothing came of the above thread for a while), but the issue seems to be
quite a bit nuanced.
Probably we need to take a step back; the question is whether something
like what Andres suggested should/could be coded up for v13 still
(before the feature freeze) and if so, by whom (I won't have the time),
or whether it would still qualify as a back-patchable bug-fix and/or
whether your suggestion above would.
Michael
--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de
credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
From | Date | Subject | |
---|---|---|---|
Next Message | James Coleman | 2020-04-06 21:20:53 | Re: [PATCH] Incremental sort (was: PoC: Partial sort) |
Previous Message | Andres Freund | 2020-04-06 21:14:01 | Re: SyncRepLock acquired exclusively in default configuration |