Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: David Christensen <david(dot)christensen(at)crunchydata(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL
Date: 2022-04-26 02:53:54
Message-ID: Ymdewu9M1LABRzhz@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 25, 2022 at 10:11:10AM -0500, David Christensen wrote:
> On Mon, Apr 25, 2022 at 1:11 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>> I don't think that there is any need to rely on a new logic if there
>> is already some code in place able to do the same work. See
>> verify_dir_is_empty_or_create() in pg_basebackup.c, as one example,
>> that relies on pg_check_dir(). I think that you'd better rely at
>> least on what pgcheckdir.c offers.
>
> Yeah, though I am tending towards what another user had suggested and
> just accepting an existing directory rather than requiring it to be
> empty, so thinking I might just skip this one. (Will review the file
> you've pointed out to see if there is a relevant function though.)

OK. FWIW, pg_check_dir() is used in initdb and pg_basebackup because
these care about the behavior to use when specifying a target path.
You could reuse it, but use a different policy depending on its
returned result for the needs of what you see fit in this case.

>> + PageSetLSN(page, record->ReadRecPtr);
>> + /* if checksum field is non-zero then we have checksums enabled,
>> + * so recalculate the checksum with new LSN (yes, this is a hack)
>> + */
>> Yeah, that looks like a hack, but putting in place a page on a cluster
>> that has checksums enabled would be more annoying with
>> zero_damaged_pages enabled if we don't do that, so that's fine by me
>> as-is. Perhaps you should mention that FPWs don't have their
>> pd_checksum updated when written.
>
> Can make a mention; you thinking just a comment in the code is
> sufficient, or want there to be user-visible docs as well?

I was thinking about a comment, at least.

> This was to make the page as extracted equivalent to the effect of
> applying the WAL record block on replay (the LSN and checksum both);
> since recovery calls this function to mark the LSN where the page came
> from this is why I did this as well. (I do see perhaps a case for
> --raw output that doesn't munge the page whatsoever, just made
> comparisons easier.)

Hm. Okay. The argument goes both ways, I guess, depending on what we
want to do with those raw pages. Still you should not need pd_lsn if
the point is to be able to stick the pages back in place to attempt to
get back as much data as possible when loading it back to the shared
buffers?
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-04-26 04:46:55 Re: [Proposal] vacuumdb --schema only
Previous Message Michael Paquier 2022-04-26 02:42:49 Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL