Re: Online checksums verification in the backend

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Online checksums verification in the backend
Date: 2020-11-02 19:34:57
Message-ID: 20201102193457.fc2hoen7ahth4bbc@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-11-02 12:35:30 -0500, Robert Haas wrote:
> On Thu, Oct 29, 2020 at 2:17 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > I think this needs to be quickly reworked or reverted.

I think it's fairly clear by now that revert is appropriate for now.

> I don't like this patch, either. In addition to what Andres mentioned,
> CheckBuffer() is a completely-special case mechanism which can't be
> reused by anything else. In particular, the amcheck for heap stuff
> which I recently committed (866e24d47db1743dfcff5bd595b57e3a143f2cb1)
> would really like a way to examine a buffer without risking an error
> if PageIsVerified() should happen to fail, but this patch is of
> absolutely no use in getting there, because CheckBuffer() doesn't give
> the caller any way to access the contents of the buffer. It can only
> do the checks that it knows how to do, and that's it. That doesn't
> seem like a good design.

Wouldn't this be better served by having a ReadBufferExtended() flag,
preventing erroring out and zeroing the buffer? I'm not sure that
handling both the case where the buffer contents need to be valid and
the one where it doesn't will make for a good API.

> I don't like the fact that CheckBuffer() silently skips dirty buffers,
> either. The comment should really say that it checks the state of a
> buffer without loading it into shared buffers, except sometimes it
> doesn't actually check it.

Yea, I don't see a good reason for that either. There's reasons for
dirty buffers that aren't WAL logged - so if the on-disk page is broken,
a standby taken outside pg_basebackup would possibly still end up with a
corrupt on-disk page. Similar with a crash restart.

> If the buffer is in shared buffers, we could take a share-lock
> on the buffer and copy the contents of the page as it exists on disk,
> and then still check it.

Don't think we need a share lock. That still allows the buffer to be
written out (and thus a torn read). What we need is to set
BM_IO_IN_PROGRESS on the buffer in question - only one backend can set
that. And then unset that again, without unsetting
BM_DIRTY/BM_JUST_DIRTIED.

> It feels really confusing to me that the user-exposed function here is
> called pg_relation_check_pages(). How is the user supposed to
> understand the difference between what this function does and what the
> new verify_heapam() in amcheck does? The answer is that the latter
> does far more extensive checks, but this isn't obvious from the SGML
> documentation, which says only that the blocks are "verified," as if
> an end-user can reasonably be expected to know what that means. It
> seems likely to lead users to the belief that if this function passes,
> they are in good shape, which is extremely far from being true. Just
> look at what PageIsVerified() checks compared to what verify_heapam()
> checks.

Yea I had similar thoughts, it should just be called
pg_checksum_verify_relation() or something.

> In fact, I would argue that this functionality ought to live in
> amcheck rather than core, though there could usefully be enabling
> functions in core.

I'm not really convinced by this though. It's not really AM
specific - works for all types of relations with storage; don't really
object either...

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2020-11-02 19:47:48 Re: how to replicate test results in cf-bot on travis
Previous Message Konstantin Knizhnik 2020-11-02 19:20:01 Re: libpq compression