From: | Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Michael Banck <michael(dot)banck(at)credativ(dot)de>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [patch] Fix checksum verification in base backups for zero page headers |
Date: | 2020-10-21 21:47:03 |
Message-ID: | 69b38451-e91f-95fc-c3ce-06eb778d36e0@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 20.10.2020 09:24, Michael Paquier wrote:
> On Fri, Oct 16, 2020 at 06:02:52PM +0300, Anastasia Lubennikova wrote:
>> In the current patch, PageIsVerifed called from pgbasebackup simply doesn't
>> Should we change this too? I don't see any difference.
> I considered that, but now that does not seem worth bothering here.
>
>> Done.
> Thanks for the updated patch. I have played with dd and some fake
> files to check the validity of the zero-case (as long as pd_lsn does
> not go wild). Here are some comments, with an updated patch attached:
> - Added a PageIsVerifiedExtended to make this patch back-patchable,
> with the original routine keeping its original shape.
Thank you. I always forget about this. Do we have any checklist for such
changes, that patch authors and reviewers can use?
> - I did not like much to show the WARNING from PageIsVerified() for
> the report, and we'd lose some context related to the base backups.
> The important part is to know which blocks from which files are found
> as problematic.
> - Switched the checks to have PageIsVerified() placed first in the
> hierarchy, so as we do all the basic validity checks before a look at
> the LSN. This is not really change things logically.
> - The meaning of block_retry is also the opposite of what it should be
> in the original code? IMO, the flag should be set to true if we still
> are fine to retry a block, and set it to false once the one-time retry
> has failed.
Agree.
> - The error strings are not really consistent with the project style
> in this area. These are usually not spawned across multiple lines to
> ease searches with grep or such.
Same question as above. I don't see this info about formatting in the
error message style guide in documentation. Is it mentioned somewhere else?
> Anastasia, Michael B, does that look OK to you?
The final patch looks good to me.
> NB: This patch fixes only one problem, the zero-page case, as it was
> the intention of Michael B to split this part into its own thread. We
> still have, of course, a second problem when it comes to a random LSN
> put into the page header which could mask an actual checksum failure
> so this only takes care of half the issues. Having a correct LSN
> approximation is a tricky problem IMO, but we could improve things by
> having some delta with an extra WAL page on top of GetInsertRecPtr().
> And this function cannot be used for base backups taken from
> standbys.
We can also read such pages via shared buffers to be 100% sure.
--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
From | Date | Subject | |
---|---|---|---|
Next Message | David G. Johnston | 2020-10-21 22:25:51 | Re: [DOC] Document concurrent index builds waiting on each other |
Previous Message | Thomas Munro | 2020-10-21 21:22:32 | Re: language cleanups in code and docs |