Re: pg_waldump: support decoding of WAL inside tarfile

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Amul Sul <sulamul(at)gmail(dot)com>, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
Date: 2026-04-02 23:49:21
Message-ID: CA+hUKGKfti_FMFuduXEZs96W5Boce9gSLZ5Ei158dFiuLuWLgA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 3, 2026 at 11:50 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > How about using --format=ustar, instead of that sparse control stuff?
>
> I did it that way for GNU tar, but did not research whether bsdtar
> will take that option. Feel free to hack on ebba64c08 some more.
>
> (It seems though that the two tars' locutions for "write to stdout"
> are different, so we might have to have separate tests even if they
> end up pushing the same option.)

I have:

$ tar --version
bsdtar 3.8.2 - libarchive 3.8.2 zlib/1.3.1 liblzma/5.8.1 libzstd/1.5.2
openssl/3.5.4 libb2/bundled
$ gtar --version
tar (GNU tar) 1.35
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by John Gilmore and Jay Fenlason.

This seems to work for both:

$ tar --format=ustar -c /dev/null > /dev/null
tar: Removing leading '/' from member names
$ gtar --format=ustar -c /dev/null > /dev/null
gtar: Removing leading `/' from member names

The attached passes with both, and regress_log_001_basic looks like:

# Running: /usr/bin/tar --format=ustar -cf /tmp/J_ifbfUOSd/pg_wal.tar
archive_status 000000010000000000000001 000000010000000000000003
000000010000000000000002 summaries
[12:12:24.301](0.072s) ok 101

# Running: /usr/local/bin/gtar --format=ustar -cf
/tmp/pbdsHdrAdw/pg_wal.tar 000000010000000000000002 archive_status
000000010000000000000003 summaries 000000010000000000000001
[12:18:14.739](0.050s) ok 101

I think a Windows system could be using either. BSD tar comes
pre-installed by Microsoft and people often install GNU tools. So I
think we should use File::Spec->devnull() instead of /dev/null, and
Andrew showed that working. I doubt Windows is capable of making
sparse files (except perhaps with ReFS?), but it's nice to use the
same code everywhere and future-proof in case GNU carries out its
threat to switch to pax by default. Windows probably has file
attributes that ustar can't represent (?), so I guess that might
motivate it to use pax headers if they are indeed added only when
needed.

Longer term I think we need to tolerate but ignore pax headers. If I
understand the spirit of this long evolution, pax archives are
intended to be acceptable to pre-pax implementations, which implies
that they can't really change the meaning of the bits of the file
contents. That's why GNU's --sparse hides funky file encodings from
old tars by renaming them to GNUSparseFile.%p/%f, and that leads back
to my original suggestion that we should figure out how to detect and
reject pax only if we failed to find the file under the expected name.
(Or of course we could just implement support for that, and I have a
half-baked trial patch for that but now is not the time.)
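
To make "tolerate but ignore" concrete, here is a minimal sketch in
Python (not the pg_waldump C code, and not the trial patch mentioned
above). It walks the raw 512-byte headers and skips entries whose
typeflag is 'x' (per-file extended header) or 'g' (global extended
header) without interpreting them. The function names and the demo
archive are made up for illustration, and the parser is deliberately
simplified: it ignores the ustar prefix field and GNU's base-256 size
encoding.

```python
import io
import tarfile

BLOCK = 512  # tar headers and data padding are both in 512-byte units


def scan_headers(data):
    """Return (typeflag, name) for every header block in a tar image."""
    entries = []
    offset = 0
    while offset + BLOCK <= len(data):
        header = data[offset:offset + BLOCK]
        if header == b"\0" * BLOCK:  # first all-zero block ends the archive
            break
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8", "replace")
        size_field = header[124:136].rstrip(b"\0 ").decode("ascii")
        size = int(size_field, 8) if size_field else 0  # octal size field
        typeflag = header[156:157].decode("ascii")
        entries.append((typeflag, name))
        # Skip this header plus its data, rounded up to a whole block.
        offset += BLOCK + ((size + BLOCK - 1) // BLOCK) * BLOCK
    return entries


def member_names(data):
    """List member names, tolerating but ignoring pax 'x'/'g' entries."""
    return [name for flag, name in scan_headers(data) if flag not in ("x", "g")]


def build_archive(fmt, pax_headers=None):
    """Build a one-member archive in memory (hypothetical demo data)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        info = tarfile.TarInfo("000000010000000000000001")
        payload = b"not real WAL"
        info.size = len(payload)
        if pax_headers:
            info.pax_headers = pax_headers  # forces an 'x' entry in pax mode
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


if __name__ == "__main__":
    pax = build_archive(tarfile.PAX_FORMAT, {"comment": "demo"})
    print(scan_headers(pax))   # shows the extra 'x' entry before the member
    print(member_names(pax))   # the member is still found under its own name
```

Because the file contents and the member's own header are unchanged
by the extended header, skipping the 'x' entry is enough to find the
file under the expected name in this simple case. It would not be
enough for something like GNU's --sparse renaming described above.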

Attachment: 0001-Harmonize-tar-option-tests-from-ebba64c0.patch (text/x-patch, 1.6 KB)
