From: | Amul Sul <sulamul(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: pg_waldump: support decoding of WAL inside tarfile |
Date: | 2025-09-25 08:24:51 |
Message-ID: | CAAJ_b94Uh+b41LQG45bZFK+i62EVvv972LiGWWWuR64=-64rTQ@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Sep 12, 2025 at 11:58 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> Here are some review comments on v3-0004:
>
Thanks for the review. My replies are below.
> There doesn't seem to be any reason for
> astreamer_waldump_content_new() to take an astreamer *next argument.
> If you look at astreamer.h, you'll see that some astreamer_BLAH_new()
> functions take such an argument, and others don't. The ones that do
> forward their input to another astreamer; the ones that don't, like
> astreamer_plain_writer_new(), send it somewhere else. AFAICT, this
> astreamer is never going to send its output to another astreamer, so
> there's no reason for this argument.
>
Done.
> I'm also a little confused by the choice of the name
> astreamer_waldump_content_new(). I would have thought this would be
> something like astreamer_waldump_new() or astreamer_xlogreader_new().
> The word "content" doesn't seem to me to be adding much here, and it
> invites confusion with the "content" callback.
>
Done -- renamed to astreamer_waldump_new().
> I think you can merge setup_astreamer() into
> init_tar_archive_reader(). The only other caller is
> verify_tar_archive(), but that does exactly the same additional steps
> as init_tar_archive_reader(), as far as I can see.
>
Done.
> The return statement for astreamer_wal_read is really odd:
>
> + return (count - nbytes) ? (count - nbytes) : -1;
>
Agreed, that's a bit odd. This seems to be leftover code from the experimental
patch. The astreamer_wal_read() function should behave like WALRead():
it should either successfully read all the requested bytes or throw an
error. Corrected in the attached version.
>
> I would suggest changing the name of the variable from "readBuff" to
> "readBuf". There are no existing uses of readBuff in the code base.
>
The existing WALDumpReadPage() function has a "readBuff" argument, and
I've used it that way for consistency.
> I think this comment also needs improvement:
>
> + /*
> + * Ignore existing data if the required target page
> has not yet been
> + * read.
> + */
> + if (recptr >= endPtr)
> + {
> + len = 0;
> +
> + /* Reset the buffer */
> + resetStringInfo(astreamer_buf);
> + }
>
> This comment is problematic for a few reasons. First, we're not
> ignoring the existing data: we're throwing it out. Second, the comment
> doesn't say why we're doing what we're doing, only that we're doing
> it. Here's my guess at the actual explanation -- please correct me if
> I'm wrong: "pg_waldump never reads the same WAL bytes more than once,
> so if we're now being asked for data beyond the end of what we've
> already read, that means none of the data we currently have in the
> buffer will ever be consulted again. So, we can discard the existing
> buffer contents and start over." By the way, if this explanation is
> correct, it might be nice to add an assertion someplace that verifies
> it, like asserting that we're always reading from an LSN greater than
> or equal to (or exactly equal to?) the LSN immediately following the
> last data we read.
>
Updated the comment. The similar assertion exists right before
copying to the readBuff.
>
> Another thing that isn't so nice right now is that
> verify_tar_archive() has to open and close the archive only for
> init_tar_archive_reader() to be called to reopen it again just moments
> later. It would be nicer to open the file just once and then keep it
> open. Here again, I wonder if the separation of duties could be a bit
> cleaner.
>
Prefer to keep those separate, assuming that reopening the file won't
cause any significant harm. Let me know if you think otherwise.
Attached the updated version, kindly have a look.
Regards,
Amul
Attachment | Content-Type | Size |
---|---|---|
v4-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch | application/x-patch | 2.3 KB |
v4-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch | application/x-patch | 2.3 KB |
v4-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch | application/x-patch | 5.5 KB |
v4-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch | application/x-patch | 34.9 KB |
v4-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch | application/x-patch | 19.3 KB |
v4-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch | application/x-patch | 1.7 KB |
v4-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch | application/x-patch | 15.6 KB |
v4-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch | application/x-patch | 9.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Chao Li | 2025-09-25 08:25:23 | Re: Optimize LISTEN/NOTIFY |
Previous Message | Shubham Khanna | 2025-09-25 08:18:40 | Re: Add support for specifying tables in pg_createsubscriber. |