From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Julien Rouhaud <rjuju123(at)gmail(dot)com> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Subject: | Re: Online checksums verification in the backend |
Date: | 2020-10-27 07:07:46 |
Message-ID: | 20201027070746.GD28445@paquier.xyz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Oct 23, 2020 at 06:06:30PM +0900, Michael Paquier wrote:
> On Fri, Oct 23, 2020 at 04:31:56PM +0800, Julien Rouhaud wrote:
>> Mmm, is it really an improvement to report warnings during this
>> function execution? Note also that PageIsVerified as-is won't report
>> a warning if a page is found as PageIsNew() but isn't actually all
>> zero, while still being reported as corrupted by the SRF.
>
> Yep, joining the point of above to just have no WARNINGs at all.
Now that we have d401c57, I got to consider more this one, and opted
for not generating a WARNING for now. Hence, PageisVerifiedExtended()
is disabled regarding that, but we still report a checksum failure in
it.
I have spent some time reviewing the tests, and as I felt this was
bulky. In the reworked version attached, I have reduced the number of
tests by half, without reducing the coverage, mainly:
- Removed all the stderr and the return code tests, as we always
expected the commands to succeed, and safe_psql() can do the job
already.
- Merged of the queries using pg_relation_check_pages into a single
routine, with the expected output (set of broken pages returned in the
SRF) in the arguments.
- Added some prefixes to the tests, to generate unique test names.
That makes debug easier.
- The query on pg_stat_database is run once at the beginning, once at
the end with the number of checksum failures correctly updated.
- Added comments to document all the routines, and renamed some of
them mostly for consistency.
- Skipped system relations from the scan of pg_class, making the test
more costly for nothing.
- I ran some tests on Windows, just-in-case.
I have also added a SearchSysCacheExists1() to double-check if the
relation is missing before opening it, added a
CHECK_FOR_INTERRUPTS() within the main check loop (where the error
context is really helpful), indented the code, bumped the catalogs
(mostly a self-reminder), etc.
So, what do you think?
--
Michael
Attachment | Content-Type | Size |
---|---|---|
online-checksum-v19.patch | text/x-diff | 28.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Nikolay Samokhvalov | 2020-10-27 07:12:00 | Re: Add important info about ANALYZE after create Functional Index |
Previous Message | Craig Ringer | 2020-10-27 07:07:22 | Re: Internal key management system |