Re: Online checksums verification in the backend

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Online checksums verification in the backend
Date: 2020-11-03 09:46:12
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 02, 2020 at 11:34:57AM -0800, Andres Freund wrote:
> On 2020-11-02 12:35:30 -0500, Robert Haas wrote:
>> On Thu, Oct 29, 2020 at 2:17 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>>> I think this needs to be quickly reworked or reverted.
> I think it's fairly clear by now that revert is appropriate for now.

Yep, that's clear. I'll deal with that tomorrow. That's more than a
simple rework.

>> I don't like this patch, either. In addition to what Andres mentioned,
>> CheckBuffer() is a completely-special case mechanism which can't be
>> reused by anything else. In particular, the amcheck for heap stuff
>> which I recently committed (866e24d47db1743dfcff5bd595b57e3a143f2cb1)
>> would really like a way to examine a buffer without risking an error
>> if PageIsVerified() should happen to fail, but this patch is of
>> absolutely no use in getting there, because CheckBuffer() doesn't give
>> the caller any way to access the contents of the buffer. It can only
>> do the checks that it knows how to do, and that's it. That doesn't
>> seem like a good design.
> Wouldn't this be better served by having a ReadBufferExtended() flag,
> preventing erroring out and zeroing the buffer? I'm not sure that
> handling both the case where the buffer contents need to be valid and
> the one where it doesn't will make for a good API.

If you grep for ReadBuffer_common() is some of the emails I sent.. I
was rather interested in something like that.

>> I don't like the fact that CheckBuffer() silently skips dirty buffers,
>> either. The comment should really say that it checks the state of a
>> buffer without loading it into shared buffers, except sometimes it
>> doesn't actually check it.
> Yea, I don't see a good reason for that either. There's reasons for
> dirty buffers that aren't WAL logged - so if the on-disk page is broken,
> a standby taken outside pg_basebackup would possibly still end up with a
> corrupt on-disk page. Similar with a crash restart.

Er, if you don't skip dirty buffers, wouldn't you actually report some
pages as broken if attempting to run those in a standby who may have
some torn pages from a previous base backup? You could still run into
problems post-promotion, until the first checkpoint post-recovery
happens, no?

>> If the buffer is in shared buffers, we could take a share-lock
>> on the buffer and copy the contents of the page as it exists on disk,
>> and then still check it.
> Don't think we need a share lock. That still allows the buffer to be
> written out (and thus a torn read). What we need is to set
> BM_IO_IN_PROGRESS on the buffer in question - only one backend can set
> that. And then unset that again, without unsetting

If that can work, we could make use of some of that for base backups
for a single retry of a page that initially failed.

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2020-11-03 09:47:14 Re: [PATCH] remove deprecated v8.2 containment operators
Previous Message Magnus Hagander 2020-11-03 09:38:39 Re: -O switch