Re: pg_waldump: support decoding of WAL inside tarfile

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Amul Sul <sulamul(at)gmail(dot)com>, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
Date: 2026-04-01 18:25:40
Message-ID: 3049460.1775067940@sss.pgh.pa.us
Lists: pgsql-hackers

Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> On Mon, Mar 30, 2026 at 11:23 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I agree that this isn't too critical if the effects are confined to
>> pg_waldump. I believe that pg_basebackup and pg_verifybackup also use
>> astreamer_tar.c, but it's not clear to me if they'd ever be asked to
>> parse files made by tar(1) and not by our own sparseness-ignorant
>> tar-writing code. If they can be, that'd be a higher-priority reason
>> to fill in this gap.

> Yeah, I can't see any reason why pg_verifybackup --wal-path=foo.tar
> won't suffer the same problem in the wild. Again, it's not the end of
> the world because it'll just fail and you'll probably eventually
> figure out why. So perhaps we should just improve our detection of
> archives that we can't handle?

After reading the POSIX spec for pax format (in the pax(1) man page),
I think it's absolutely essential that we reject files that contain
pax extension headers. Those can change the interpretation of the
following file header(s) in nearly arbitrary ways, so we have plenty
of problems besides this sparse-file issue if we just ignore them.

(Of course, later we can consider improving the code to handle them
correctly, but that ain't happening in time for v19.)

Also, if we are admitting the possibility that what we are reading
was made by a platform-supplied tar and not our own code, I think
it verges on lunacy to behave as though unsupported typeflags are
regular files.
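
[Editor's illustration, not taken from the attached patch.] The two points above can be sketched as a typeflag check on each 512-byte tar header: treat only '0' and NUL as regular files, flag the pax extension headers 'x' and 'g' for rejection, and refuse any other unrecognized typeflag instead of reading it as file data. The enum and function names are hypothetical and are not astreamer_tar.c identifiers; the typeflag values and the byte offset 156 come from the ustar header layout.

```c
#include <assert.h>
#include <stddef.h>

/* ustar header layout: 512-byte blocks, typeflag stored at byte 156. */
#define TAR_BLOCK_SIZE		512
#define TAR_TYPEFLAG_OFFSET	156

/* Hypothetical classification; names are illustrative only. */
typedef enum
{
	TAR_MEMBER_REGULAR,			/* '0' or NUL: ordinary file data follows */
	TAR_MEMBER_DIRECTORY,		/* '5': directory, no data blocks */
	TAR_MEMBER_SYMLINK,			/* '2': symbolic link, no data blocks */
	TAR_MEMBER_PAX_EXTENSION,	/* 'x' or 'g': pax extended header */
	TAR_MEMBER_UNSUPPORTED		/* anything else, e.g. GNU 'S', 'L', 'K' */
} tar_member_class;

/*
 * Classify one tar header block by its typeflag byte.  The caller is
 * assumed to pass a full TAR_BLOCK_SIZE-byte header.
 */
tar_member_class
classify_tar_typeflag(const char *header)
{
	assert(header != NULL);

	switch (header[TAR_TYPEFLAG_OFFSET])
	{
		case '0':
		case '\0':
			return TAR_MEMBER_REGULAR;
		case '5':
			return TAR_MEMBER_DIRECTORY;
		case '2':
			return TAR_MEMBER_SYMLINK;
		case 'x':				/* applies to the next member only */
		case 'g':				/* applies to all following members */
			return TAR_MEMBER_PAX_EXTENSION;
		default:
			/* Never fall through to "regular file" for unknown flags. */
			return TAR_MEMBER_UNSUPPORTED;
	}
}
```

In this sketch a reader would presumably raise an error for both TAR_MEMBER_PAX_EXTENSION and TAR_MEMBER_UNSUPPORTED; they are kept distinct only so that the error message can say which case was hit.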

So I think we need something more or less like the attached.

regards, tom lane

Attachment Content-Type Size
v1-tighten-tar-typeflag-handling.patch text/x-diff 5.3 KB
