Re: pg_waldump: support decoding of WAL inside tarfile

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Amul Sul <sulamul(at)gmail(dot)com>, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
Date: 2026-04-02 23:43:33
Message-ID: CAA5RZ0tt89MgNi4-0F4onH+-TFSsysFjMM-tBc6aXbuQv5xBXw@mail.gmail.com
Lists: pgsql-hackers

> >> So I think we need something like the attached, in addition
> >> to what I sent before. This just makes astreamer_tar.c use
> >> the isValidTarHeader function that pg_dump already had.
>
> > LGTM.
>
> Pushed, thanks for reviewing! In the event I decided to back-patch to
> v18, where these fixes could protect pg_verifybackup against tar files
> it can't handle.

Hi,

I just encountered a regression test failure for pg_waldump due to ebba64c08d9.

```
――――――――――――――――――――――――――――――――――――――――――――――――――――― ✀ ――――――――――――――――――――――――――――――――――――――――――――――――――――――
Listing only the last 100 lines from a long log.
# at /local/home/simseih/pgdev/installations/worktrees/dev/src/bin/pg_waldump/t/001_basic.pl line 432.
# got: 'pg_waldump: error: could not find WAL in archive "pg_wal.tar.gz"
# '
# expected: ''
```

and regress_log_001_basic shows this:

```
# Running: /usr/bin/tar --format=ustar -cf /tmp/ja26rXZOnb/pg_wal.tar archive_status 000000010000000000000002 000000010000000000000001 summaries 000000010000000000000003
[22:25:00.525](0.008s) not ok 101
[22:25:00.525](0.000s) # Failed test at /local/home/simseih/pgdev/installations/worktrees/dev/src/bin/pg_waldump/t/001_basic.pl line 350.
[22:25:00.525](0.000s) # ---------- command failed ----------
[22:25:00.526](0.000s) # /usr/bin/tar --format=ustar -cf /tmp/ja26rXZOnb/pg_wal.tar archive_status 000000010000000000000002 000000010000000000000001 summaries 000000010000000000000003
[22:25:00.526](0.000s) # -------------- stderr --------------
[22:25:00.526](0.000s) # /usr/bin/tar: value 10012663 out of uid_t range 0..2097151
```

The ustar format (--format=ustar) limits UID/GID values to 2^21 - 1
(2097151) [1], and on my machine the UID is 10012663.
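
To illustrate where that number comes from: the ustar header stores uid and
gid in 8-byte fields holding seven octal digits plus a terminator, so the
largest value is 8^7 - 1 = 2^21 - 1 = 2097151. Below is a minimal sketch using
Python's standard tarfile module. It is not part of the PostgreSQL tests, and
the helper name fits_in_ustar is made up for this example.

```python
import tarfile

# Seven octal digits fit in the 8-byte uid/gid header fields.
USTAR_ID_DIGITS = 7
USTAR_MAX_ID = 8 ** USTAR_ID_DIGITS - 1  # 2097151, i.e. 2^21 - 1


def fits_in_ustar(uid):
    """Return True if `uid` can be encoded in a ustar header (hypothetical helper)."""
    info = tarfile.TarInfo(name="000000010000000000000001")
    info.uid = uid
    try:
        # Building the 512-byte header fails if a numeric field overflows.
        info.tobuf(format=tarfile.USTAR_FORMAT)
    except ValueError:
        return False
    return True


print(USTAR_MAX_ID)             # 2097151
print(fits_in_ustar(2097151))   # True
print(fits_in_ustar(10012663))  # False
```

The last call mirrors the failure above: a UID of 10012663 cannot be written
into a ustar header.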

So I found that one way to deal with this is to run the tar command with
--owner=0 --group=0. As far as I can tell, the owner and group IDs don't
matter for these tests, so maybe that is OK.
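
As a rough analogue of that workaround (again only a sketch with Python's
tarfile module, not the actual Perl test code), recording 0 for owner and
group lets the archive be written no matter what the invoking user's UID is.
The function build_ustar_archive and its override_owner parameter are
hypothetical names for this example.

```python
import io
import tarfile


def build_ustar_archive(members, uid, override_owner=True):
    """Return the bytes of a ustar archive holding `members` (name -> bytes).

    `uid` stands in for the invoking user's UID/GID. With override_owner set,
    the header records 0 instead, loosely like tar's --owner=0 --group=0.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.uid = info.gid = 0 if override_owner else uid
            # Raises ValueError if uid/gid do not fit the ustar header fields.
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


archive = build_ustar_archive({"000000010000000000000001": b"dummy"}, uid=10012663)
with tarfile.open(fileobj=io.BytesIO(archive)) as tar:
    for member in tar:
        print(member.name, member.uid, member.gid)
```

Calling it with override_owner=False and the same UID raises ValueError, which
corresponds to the tar failure in the log.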

@@ -1333,6 +1333,10 @@ sub tar_portability_options
== 0)
{
push(@tar_p_flags, "--format=ustar");
+ # ustar format supports UIDs only up to 2^21 - 1 (2097151).
+ # Override owner/group to avoid failures on systems where
+ # the running user's UID/GID exceeds that limit.
+ push(@tar_p_flags, "--owner=0", "--group=0");
}

While this fixes the test, I am not sure what the implications of
--format=ustar are for pg_waldump in the broader discussion.

[1] https://www.gnu.org/software/tar/manual/html_section/Formats.html

--
Sami Imseih
Amazon Web Services (AWS)
