Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL

From: David Christensen <david(dot)christensen(at)crunchydata(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL
Date: 2022-11-17 16:32:05
Message-ID: CAOxo6XLhduEAn0yLh+BGOjpq6k0x9MsOY6PjDEoCh4VvdUxCiQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 16, 2022 at 3:30 AM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> 1.
> - if (config.filter_by_fpw && !XLogRecordHasFPW(xlogreader_state))
> + if (config.filter_by_fpw && !XLogRecordHasFPI(xlogreader_state))
> These changes are not related to this feature, hence renaming those
> variables/function names must be dealt with separately. If required,
> proposing another patch can be submitted to change filter_by_fpw to
> filter_by_fpi and XLogRecordHasFPW() to XLogRecordHasFPI().

Not required; can revert the changes unrelated to this specific patch.
(I'd written the original ones for it, so didn't really think anything
of it... :-))

> 2.
> + /* We fsync our output directory only; since these files are not part
> + * of the production database we do not require the performance hit
> + * that fsyncing every FPI would entail, so are doing this as a
> + * compromise. */
> The commenting style doesn't match the standard that we follow
> elsewhere in postgres, please refer to other multi-line comments.

Will fix.

> 3.
> + fsync_fname(config.save_fpi_path, true);
> + }
> It looks like fsync_fname()/fsync() in general isn't recursive, in the
> sense that it doesn't fsync the files under the directory, but the
> directory only. So, the idea of directory fsync doesn't seem worth it.
> We either 1) get rid of fsync entirely or 2) fsync all the files after
> they are created and the directory at the end or 3) do option (2) with
> --no-sync option similar to its friends. Since option (2) is a no go,
> we can either choose option (1) or option (2). My vote at this point
> is for option (1).

Agree to remove.

> 4.
> +($walfile_name, $blocksize) = split '\|' =>
> $node->safe_psql('postgres',"SELECT pg_walfile_name(pg_switch_wal()),
> current_setting('block_size')");
> +my $walfile = $node->basedir . '/pgdata/pg_wal/' . $walfile_name;
> I think there's something wrong with this, no? pg_switch_wal() can, at
> times, return end+1 of the prior segment (see below snippet) and I'm
> not sure if such a case can happen here.
>
> * The return value is either the end+1 address of the switch record,
> * or the end+1 address of the prior segment if we did not need to
> * write a switch record because we are already at segment start.
> */
> XLogRecPtr
> RequestXLogSwitch(bool mark_unimportant)

I think this approach is pretty common to get the walfile name, no?
While there might be an edge case here, since the rest of the test is
a controlled environment I'm inclined to just not worry about it; this
would require the changes prior to this to exactly fill a WAL segment
which strikes me as extremely unlikely to the point of impossible in
this specific scenario.

> 5.
> +my $walfile = $node->basedir . '/pgdata/pg_wal/' . $walfile_name;
> +ok(-f $walfile, "Got a WAL file");
> Is this checking if the WAL file is present or not in PGDATA/pg_wal?
> If yes, I think this isn't required as pg_switch_wal() ensures that
> the WAL is written and flushed to disk.

You are correct, probably another artifact of the earlier version.
That said, not sure I see the harm in keeping it as a sanity-check.

> 6.
> +my $walfile = $node->basedir . '/pgdata/pg_wal/' . $walfile_name;
> Isn't "pgdata" hardcoded here? I think you might need to do the following:
> $node->data_dir . '/pg_wal/' . $walfile_name;;

Can fix.

> 7.
> + # save filename for later verification
> + $files{$file}++;
>
> +# validate that we ended up with some FPIs saved
> +ok(keys %files > 0, 'verify we processed some files');
> Why do we need to store filenames in an array when we later just check
> the size of the array? Can't we use a boolean (file_found) or an int
> variable (file_count) to verify that we found the file.

Another artifact; we were comparing the files output between two
separate lists of arbitrary numbers of pages being written out and
verifying the raw/fixup versions had the same lists.

> 8.
> +$node->safe_psql('postgres', <<EOF);
> +SELECT 'init' FROM
> pg_create_physical_replication_slot('regress_pg_waldump_slot', true,
> false);
> +CREATE TABLE test_table AS SELECT generate_series(1,100) a;
> +CHECKPOINT; -- required to force FPI for next writes
> +UPDATE test_table SET a = a + 1;
> +EOF
> The EOF with append_conf() is being used in 4 files and elsewhere in
> the TAP test files (more than 100?) qq[] or quotes is being used. I
> have no strong opinion here, I'll leave it to the other reviewers or
> committer.

I'm inclined to leave it just for (personal) readability, but can
change if there's a strong consensus against.

> > > 11.
> > > +# verify filename formats matches w/--save-fpi
> > > +for my $fullpath (glob "$tmp_folder/raw/*")
> > > Do we need to look for the exact match of the file that gets created
> > > in the save-fpi path? While checking for this is great, it makes the
> > > test code non-portable (may not work on Windows or other platforms,
> > > no?) and complex? This way, you can get rid of get_block_info() as
> > > well? And +for my $fullpath (glob "$tmp_folder/raw/*")
> > > will also get simplified.
> > >
> > > I think you can further simplify the tests by:
> > > create the node
> > > generate an FPI
> > > call pg_waldump with save-fpi option
> > > check the target directory for a file that contains the relid,
> > > something like '%relid%'.
> > >
> > > The above would still serve the purpose, tests the code without much complexity.
> >
> > I disagree; I think there is utility in keeping the validation of the
> > expected output. Since we have the code that works for it (and does
> > work on Windows, per passing the CI tests) I'm not seeing why we
> > wouldn't want to continue to validate as much as possible.
>
> My intention is to simplify the tests further and I still stick to it.
> It looks like the majority of test code is to form the file name in
> the format that pg_waldump outputs and match with the file name in the
> target directory - for instance, in get_block_info(), and in the loop
> for my $fullpath (glob "$tmp_folder/raw/*"). I don't think the tests
> need to aim for file format checks, it's enough to look for the
> written file with '%relid%' by pg_waldump, if needed, the contents of
> the files written/FPI can also be verified with, say, pg_checksum
> tool. Others may have different opinions though.

I would like to get broader feedback before changing this.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Christensen 2022-11-17 16:34:48 Re: Moving forward with TDE
Previous Message Andres Freund 2022-11-17 16:27:12 Re: allow segment size to be set to < 1GiB