Quick Links

Re: pg_waldump: support decoding of WAL inside tarfile

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Amul Sul <sulamul(at)gmail(dot)com>
Cc:	Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pg_waldump: support decoding of WAL inside tarfile
Date:	2026-01-27 21:11:26
Message-ID:	CA+TgmoYuC-92tMxZHn-KY=ExF-YkHkd4yOfiMhi9upqk3M1DuA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Jan 27, 2026 at 7:07 AM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> In the attached version, I am using the WAL segment name as the hash
> key, which is much more straightforward. I have rewritten
> read_archive_wal_page(), and it looks much cleaner than before. The
> logic to discard irrelevant WAL files is still within
> get_archive_wal_entry. I added an explanation for setting cur_wal to
> NULL, which is now handled in the separate function I mentioned
> previously.
>
> Kindly have a look at the attached version; let me know if you are
> still not happy with the current approach for filtering/discarding
> irrelevant WAL segments. It isn't much different from the previous
> version, but I have tried to keep it in a separate routine for better
> code readability, with comments to make it easier to understand. I
> also added a comment for ArchivedWALFile.

I feel like the division of labor between get_archive_wal_entry() and
read_archive_wal_page() is odd. I noticed this in the last version,
too, and it still seems to be the case. get_archive_wal_entry() first
calls ArchivedWAL_lookup(). If that finds an entry, it just returns.
If it doesn't, it loops until an entry for the requested file shows up
and then returns it. Then control returns to read_archive_wal_page()
which loops some more until we have all the data we need for the
requested file. But it seems odd to me to have two separate loops
here. I think that the first loop is going to call read_archive_file()
until we find the beginning of the file that we care about and then
the second one is going to call read_archive_file() some more until we
have read enough of it to satisfy the request. It feels odd to me to
do it that way, as if we told somebody to first wait until 9 o'clock
and then wait another 30 minutes, instead of just telling them to wait
until 9:30. I realize it's not quite the same thing, because apart
from calling read_archive_file(), the two loops do different things,
but I still think it looks odd.

+ /*
+ * Ignore if the timeline is different or the current segment is not
+ * the desired one.
+ */
+ XLogFromFileName(entry->fname, &curSegTimeline, &curSegNo, WalSegSz);
+ if (privateInfo->timeline != curSegTimeline ||
+ privateInfo->startSegNo > curSegNo ||
+ privateInfo->endSegNo < curSegNo ||
+ segno > curSegNo)
+ {
+ free_archive_wal_entry(entry->fname, privateInfo);
+ continue;
+ }

The comment doesn't match the code. If it did, the test would be
(privateInfo->timeline != curSegTimeline || segno != curSegno). But
instead the segno test is > rather than !=, and the checks against
startSegNo and endSegNo aren't explained at all. I think I understand
why the segno test uses > rather than !=, but it's the point of the
comment to explain things like that, rather than leaving the reader to
guess. And I don't know why we also need to test startSegNo and
endSegNo.

I also wonder what the point is of doing XLogFromFileName() on the
fname provided by the caller and then again on entry->fname. Couldn't
you just compare the strings?

Again, the division of labor is really odd here. It's the job of
astreamer_waldump_content() to skip things that aren't WAL files at
all, but it's the job of get_archive_wal_entry() to skip things that
are WAL files but not the one we want. I disagree with putting those
checks in completely separate parts of the code.

+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ char **fname)
+{
+ int pathlen;
+ char *filename;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ pathlen = strlen(member->pathname);
+ if (pathlen < XLOG_FNAME_LEN)
+ return false;
+
+ /* WAL files from the top-level or pg_wal directory will be decoded */
+ if (pathlen > XLOG_FNAME_LEN &&
+ strncmp(member->pathname, XLOGDIR, strlen(XLOGDIR)) != 0)
+ return false;
+
+ /* WAL file could be with full path */
+ filename = member->pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(filename))
+ return false;
+
+ *fname = pnstrdup(filename, XLOG_FNAME_LEN);
+
+ return true;
+}

I don't think this code is fully correct. Note that pg_verifybackup's
member_verify_header() prepends "./" and then applies
canonicalize_path(). I think the same considerations apply here.
Consider:

[rhaas pgdata]$ (cd pg_wal; tar -cf ../pg_wal1.tar 00*)
[rhaas pgdata]$ (cd pg_wal; tar -cf ../pg_wal2.tar .)
[rhaas pgdata]$ tar tf pg_wal1.tar
00000001000000000000002C
00000001000000000000002D
00000001000000000000002E
00000001000000000000002F
000000010000000000000030
000000010000000000000031
000000010000000000000032
000000010000000000000033
[rhaas pgdata]$ tar tf pg_wal2.tar
./
./00000001000000000000002C
./00000001000000000000002D
./00000001000000000000002E
./archive_status/
./000000010000000000000031
./000000010000000000000030
./summaries/
./00000001000000000000002F
./000000010000000000000032
./000000010000000000000033

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Re: pg_waldump: support decoding of WAL inside tarfile at 2026-01-27 12:06:34 from Amul Sul

Responses

Re: pg_waldump: support decoding of WAL inside tarfile at 2026-02-04 13:09:10 from Amul Sul

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2026-01-27 21:42:35	Re: AIX support
Previous Message	Masahiko Sawada	2026-01-27 20:35:30	Re: pg_upgrade: optimize replication slot caught-up check