Re: Online verification of checksums

From: Andres Freund <andres(at)anarazel(dot)de>
To: Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: Online verification of checksums
Date: 2019-03-19 21:44:52
Message-ID: 20190319214452.7ithifs7iy5yxvsi@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-03-19 22:39:16 +0100, Michael Banck wrote:
> Am Dienstag, den 19.03.2019, 13:00 -0700 schrieb Andres Freund:
> > a) checks that the page is all zeroes if PageIsNew() (like
> > PageIsVerified() does for the backend). That avoids missing cases
> > where corruption just zeroed out the header, but not the whole page.
>
> We can't run pg_checksum_page() on those afterwards though as it would
> fire an assertion:
>
> |pg_checksums: [...]/../src/include/storage/checksum_impl.h:194:
> |pg_checksum_page: Assertion `!(((PageHeader) (&cpage->phdr))->pd_upper
> |== 0)' failed.
>
> But we should count it as a checksum error and generate an appropriate
> error message in that case.

All I'm saying is that if PageIsNew() you need to run the same checks
that PageIsVerified() runs in that case. Namely verifying that the page
is all-zeroes, rather than just the pd_upper field. That's separate
from running pg_checksum_page().

> > b) Check that pd_lsn is between startlsn and the insertion pointer. That
> > avoids accepting just about all random data.
>
> However, for pg_checksums being a stand-alone application it can't just
> access the insertion pointer, can it? We could maybe set a threshold
> from the last checkpoint after which we consider the pd_lsn bogus. But
> what's a good threshold here?

That's *PRECISELY* my point. I think it's a bad idea to do online
checksumming from outside the backend. It needs to be inside the
backend, and if there's any verification failures on a block, it needs
to acquire the IO lock on the page, and reread from disk.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2019-03-19 22:13:28 Re: Proposal to suppress errors thrown by to_reg*()
Previous Message Michael Banck 2019-03-19 21:39:16 Re: Online verification of checksums