Quick Links

Re: pg_waldump: support decoding of WAL inside tarfile

From:	Amul Sul <sulamul(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pg_waldump: support decoding of WAL inside tarfile
Date:	2026-01-23 12:27:11
Message-ID:	CAAJ_b95FOeW38gw-3BLmpdnTWHFimopTvf=eTObYUbTOC0x8qg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Mon, Jan 19, 2026 at 7:01 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> > But, I am thinking that instead of setting privateInfo->cur_wal to
> > NULL, we could simply discard the buffer data at that place and let
> > memcpy in astreamer_content copy if it would be that much of an issue.
>
> I don't think doing unnecessary copying is the right way forward. The
> copying itself could be expensive, but I am a little concerned about
> the memory utilization of this code. Suppose the user has increased
> the WAL segment size to 1GB or even higher. It seems like we could
> buffer a whole segment or maybe even more in some scenarios. If we
> avoid copying data we don't need, then we also avoid buffering it in
> memory.
>
> In terms of the separation of concerns, we could view setting
> privateInfo->cur_wal = NULL as a form of signaling, a way for this
> code to tell the astreamer that it doesn't need the data buffering.
> However, I think it might be better to make the signaling more
> explicit. Instead of having the caller directly set the buffer to
> NULL, or directly trim data out of the buffer, maybe it should set
> some value in privateInfo that tells the astreamer what to do. For
> instance, suppose it sets the oldest LSN that it might still care
> about in privateInfo, and then the astreamer is free to do skip
> copying of any data prior to that LSN, and discard any that it already
> has. Especially if properly commented, I think this might be more
> clear than what you have now. I am not 100% sure that's the right
> idea, though. I just think right now it's a bit murky who does what
> and why the division of responsibilities is what it is.
>

The current implementation of astreamer_waldump_content() does not
have sufficient information to skip WAL segments during the initial
hash entry preparation and data-copying phase. Because the filtration
parameters -- which determine if a segment should be skipped -- depend
on the WAL segment size, we must first read a WAL page through the
streamer to calculate that size, which is done in
init_archive_reader(). Therefore, the responsibility of the archive
streamer is strictly to copy the WAL segment data to the buffer.

The skipping decision is handled inside get_archive_wal_entry(), which
sets privateInfo->cur_wal to NULL. In the next version, I am planning
to add a separate routine (with better commenting) that, along with
setting the pointer to NULL, releases that hash entry to avoid
unnecessary memory usage.

Another option I previously considered was adding the filtration logic
inside the archive streamer itself. However, since the very first read
is required to calculate the WAL segment size, the filter check cannot
be performed immediately. However, we could send a signal to the
archive streamer via privateInfo (e.g., a read_any_wal or
skip_wal_check boolean flag) to disable the filtration check until the
size is calculated. But that approach isn't very elegant; if the first
WAL page we read belongs to a segment we actually want to skip, we
would still have to run the filter check and handle the skip/removal
logic outside of the streamer (i.e., inside init_archive_reader()).
This would result in performing the same filtration check in two
different places.

Therefore, I believe performing the filtration check through
get_archive_wal_entry() and then calling a routine to clear
privateInfo->cur_wal and the hash entry is the better approach, IMO.
Additionally, once we consume a WAL file and move to the next one, the
hash entry and buffer for that WAL can be released to prevent
unnecessary memory consumption by calling the same routine that I am
planning to add.

> > > + if (strstr(member->pathname, "PaxHeaders."))
> > > + return false;
> > >
> > > There is no way that a filename containing the string "PaxHeaders."
> > > could ever pass the IsXLogFileName test just above. We shouldn't need
> > > this.
> > >
> >
> > This checks the directory name (e.g.,
> > x_dir/y_dir/PaxHeaders.NNNN/wal_file_name). The name of that metadata
> > file is exactly the same as the WAL file name, which is why
> > IsXLogFileName() doesn't help here.
>
> I think this code should only be considering files in the toplevel
> directory, and skipping over any directories it finds. I absolutely
> promise I am not going to commit anything that is specifically looking
> for PaxHeaders. Nothing we've ever done with tar files up to now has
> required that, and I don't think this should, either.

Ok, fair enough, my intention was to allow decoding of valid WAL data
from any directory in the tar archive, but I will go ahead and add
that restriction as suggested.

Regards,
Amul

In response to

Re: pg_waldump: support decoding of WAL inside tarfile at 2026-01-19 13:30:48 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Heikki Linnakangas	2026-01-23 12:30:15	Re: Is abort() still needed in WalSndShutdown()?
Previous Message	Jim Jones	2026-01-23 12:19:25	Re: WIP - xmlvalidate implementation from TODO list