Re: pg_waldump: support decoding of WAL inside tarfile

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Amul Sul <sulamul(at)gmail(dot)com>, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
Date: 2026-03-22 11:24:56
Message-ID: CAD5tBcLVWKnph3iB-VPuPKR0dCckOJRFZW2-4H7HTTmhw8-vOg@mail.gmail.com
Lists: pgsql-hackers

On Sun, Mar 22, 2026 at 12:24 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I wrote:
> > Unsurprisingly, applying this change to unmodified master results
> > in the pg_waldump and pg_verifybackup tests falling over. More
> > surprisingly, they still fall over after applying your fix to the
> > decompressors, so there's some other source of garbage trailing
> > data. I haven't figured out what.
>
> In the learn-something-new-every-day dept.: good ol' GNU tar itself
> does that. By default, it zero-pads its output to a multiple of 10kB
> after it's written the required terminator. Moreover, this behavior
> is actually specified by POSIX:
>
> -x format
> Specify the output archive format. The pax utility shall support
> the following formats:
> ...
> ustar
> The tar interchange format; see the EXTENDED DESCRIPTION
> section. The default blocksize for this format for character
> special archive files shall be 10240. Implementations shall
> support all blocksize values less than or equal to 32256 that
> are multiples of 512.
>
> So, astreamer_tar_parser_content's idea that it should disallow more
> than 1024 bytes of trailer is completely wrong, which we would have
> figured out long ago if the code attempting to enforce that weren't
> completely broken.
>
> You could argue that this means the tar files our existing utilities
> create aren't POSIX-compliant. I think it's all right though: we
> can just say that we write these files with blocksize 1024 not
> blocksize 10240, and tar-file readers are required to accept that
> per the above spec text.
>
> However, this discourages me from editorializing on the file trailer
> emitted by whatever wrote the tar file we are reading. I think
> emitting it as-is is the most appropriate thing. So we should just
> get rid of astreamer_tar_parser_content's nonfunctional error check
> and not change its behavior otherwise.
>
>
>
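For anyone who wants to see the padding described above without GNU tar at hand, here is a minimal sketch using Python's standard tarfile module. It is not GNU tar, but it also pads its output to a 10240-byte record (tarfile.RECORDSIZE) after the two terminating zero blocks, so the trailer comes out far longer than 1024 bytes. The member name and payload are made up for illustration, and trailer_length() is a hypothetical helper, not anything from the patch set.

```python
import io
import tarfile


def build_tiny_tar() -> bytes:
    """Write a one-member ustar archive into memory and return its bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        payload = b"hello"  # illustrative payload, not real WAL data
        info = tarfile.TarInfo(name="000000010000000000000001")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def trailer_length(archive: bytes) -> int:
    """Return the number of bytes after the last member's final 512-byte block."""
    end_of_members = 0
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
        for member in tar:
            data_blocks = (member.size + 511) // 512
            end_of_members = member.offset_data + data_blocks * 512
    trailer = archive[end_of_members:]
    # Everything after the last member should be zero bytes.
    if trailer.count(0) != len(trailer):
        raise ValueError("non-zero bytes found in tar trailer")
    return len(trailer)


archive = build_tiny_tar()
print("archive size:", len(archive))
print("trailer size:", trailer_length(archive))
```

A reader that rejected any trailer longer than 1024 bytes would refuse this archive, even though it is perfectly valid.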
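OK, patch 5 of this set does that: it removes the nonfunctional trailer
size check from astreamer_tar_parser_content and leaves its behavior
otherwise unchanged. I reworked your previous patches 2 and 3 slightly,
mostly adding comments and fixing a bug in the use of
sizeof(XLogLongPageHeader). Patch 4 here tries to fix the wrong use of
cur_file in get_archive_wal_entry().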
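The message does not spell out what was wrong with sizeof(XLogLongPageHeader). One plausible reading, offered here as an assumption, is the usual pointer-typedef trap: in xlog_internal.h, XLogLongPageHeader is a pointer to XLogLongPageHeaderData, so sizeof on it yields the size of a pointer rather than the size of the header. The sketch below mimics that with ctypes. The struct layouts are approximate stand-ins written from memory, not authoritative copies of the PostgreSQL definitions.

```python
import ctypes


class PageHeaderData(ctypes.Structure):
    # Approximate stand-in for XLogPageHeaderData (illustration only).
    _fields_ = [
        ("xlp_magic", ctypes.c_uint16),
        ("xlp_info", ctypes.c_uint16),
        ("xlp_tli", ctypes.c_uint32),
        ("xlp_pageaddr", ctypes.c_uint64),
        ("xlp_rem_len", ctypes.c_uint32),
    ]


class LongPageHeaderData(ctypes.Structure):
    # Approximate stand-in for XLogLongPageHeaderData (illustration only).
    _fields_ = [
        ("std", PageHeaderData),
        ("xlp_sysid", ctypes.c_uint64),
        ("xlp_seg_size", ctypes.c_uint32),
        ("xlp_xlog_blcksz", ctypes.c_uint32),
    ]


# Plays the role of: typedef XLogLongPageHeaderData *XLogLongPageHeader;
LongPageHeader = ctypes.POINTER(LongPageHeaderData)

pointer_size = ctypes.sizeof(LongPageHeader)     # size of a pointer
struct_size = ctypes.sizeof(LongPageHeaderData)  # size of the actual header

print("sizeof(pointer typedef):", pointer_size)
print("sizeof(header struct):  ", struct_size)
```

Code that needs a whole long page header has to use the struct size; the pointer size is much smaller on any common platform.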

cheers

andrew

Attachment                                                       Content-Type  Size
v5-0003-Fix-init_archive_reader-to-not-depend-on-cur_file.patch  text/x-patch  3.6 KB
v5-0001-Fix-finalization-of-decompressor-astreamers.patch        text/x-patch  2.8 KB
v5-0004-Fix-get_archive_wal_entry-to-handle-cur_file-tran.patch  text/x-patch  6.5 KB
v5-0002-Fix-failure-to-finalize-the-decompression-pipelin.patch  text/x-patch  6.6 KB
v5-0005-Remove-nonfunctional-tar-file-trailer-size-check.patch   text/x-patch  2.2 KB
