Re: Strengthen pg_waldump's --save-fullpage tests

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Strengthen pg_waldump's --save-fullpage tests
Date: 2023-01-11 13:47:47
Message-ID: CALj2ACUx-W07Tf9cV3pdgfd750BNVe3MbuQ2X-TzYE3VJR_kfQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 11, 2023 at 3:28 PM Drouvot, Bertrand
<bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> Hi,
>
> On 1/11/23 5:17 AM, Bharath Rupireddy wrote:
> > On Wed, Jan 11, 2023 at 6:32 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> >>
> >> On Tue, Jan 10, 2023 at 05:25:44PM +0100, Drouvot, Bertrand wrote:
> >>> I like the idea of comparing the full page (and not just the LSN) but
> >>> I'm not sure that adding the pageinspect dependency is a good thing.
> >>>
> >>> What about extracting the block directly from the relation file and
> >>> comparing it with the one extracted from the WAL? (We'd need to skip the
> >>> first 8 bytes to skip the LSN though).
> >>
> >> Byte-by-byte counting for the page hole?
>
> I've in mind to use diff on the whole page (minus the LSN).
>
> >> The page checksum would
> >> matter as well,
>
> Right, but the TAP test is done without checksum and we could also
> skip the checksum from the page if we really want to.
>
> > Right. LSN of FPI from the WAL record and page from the table won't be
> > the same, essentially FPI LSN <= table page.
>
> Right, that's why I proposed to exclude it for the comparison.
>
> What about something like the attached?

Note that the raw page in the table might differ not just in page LSN
but also in other fields; see heap_mask(), for instance. It masks the
LSN, checksum, hint bits, unused space, etc. before FPI consistency is
verified during recovery in verifyBackupPageConsistency().
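
To illustrate the kind of masking described above, here is a minimal
sketch in Perl. It is not the core implementation: mask_page() is a
hypothetical helper, it assumes the standard page header layout (pd_lsn
at bytes 0..7, pd_checksum at 8..9, pd_lower and pd_upper as native
uint16 at offsets 12 and 14, 24-byte header), and it covers only the
LSN, the checksum and the unused space. Hint bits and the other fields
that heap_mask() handles are left out.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the page with the fields that may
# legitimately differ zeroed out. Simplified; hint bits are not handled.
sub mask_page
{
	my ($page) = @_;
	my $len = length($page);

	# Zero pd_lsn (bytes 0..7) and pd_checksum (bytes 8..9).
	substr($page, 0, 10, "\0" x 10);

	# pd_lower and pd_upper: native-endian uint16 at offsets 12 and 14
	# (assumed layout).
	my ($lower, $upper) = unpack('S S', substr($page, 12, 4));

	# Zero the hole between pd_lower and pd_upper if the bounds look sane.
	if ($lower >= 24 && $lower <= $upper && $upper <= $len)
	{
		substr($page, $lower, $upper - $lower, "\0" x ($upper - $lower));
	}

	return $page;
}
```

Two page images would then be compared as mask_page($a) eq
mask_page($b) rather than byte for byte.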

I think the job of verifying an FPI from a WAL record against the table
page is better left to core, via verifyBackupPageConsistency(). Honestly,
the pg_waldump tests are fine with what they have currently: LSN checks.

+# Extract the binary data without the LSN from the relation's block
+sysseek($frel, 8, 0); #bypass the LSN
+sysread($frel, $blk, 8184) or die "sysread failed: $!";
+syswrite($blkfrel, $blk) or die "syswrite failed: $!";

I doubt that these tests are portable with hardcoded values such as the
above (8184 assumes the default 8kB block size).
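
As a rough sketch of what avoiding those constants could look like, the
read length can be derived from the block size instead of being written
out as 8184. read_block_without_lsn() is a hypothetical helper, and the
8-byte LSN size is still an assumption about the page header. In a TAP
test the block size could presumably come from something like
$node->safe_psql('postgres', 'SHOW block_size').

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Hypothetical helper: read block $blkno from $path, skipping the page LSN.
# $block_size is passed in by the caller rather than hardcoded.
sub read_block_without_lsn
{
	my ($path, $blkno, $block_size) = @_;
	my $lsn_size = 8;    # size of pd_lsn, assumed fixed
	my $want = $block_size - $lsn_size;

	open(my $fh, '<:raw', $path) or die "could not open $path: $!";
	sysseek($fh, $blkno * $block_size + $lsn_size, SEEK_SET)
	  or die "sysseek failed: $!";
	my $got = sysread($fh, my $buf, $want);
	die "sysread failed: $!" unless defined $got;
	die "short read: got $got of $want bytes" unless $got == $want;
	close($fh);

	return $buf;
}
```

This only removes the dependency on the block size; it does nothing
about the masking concerns above.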

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
