| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
| Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Amul Sul <sulamul(at)gmail(dot)com>, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: pg_waldump: support decoding of WAL inside tarfile |
| Date: | 2026-04-03 00:11:37 |
| Message-ID: | 3686764.1775175097@sss.pgh.pa.us |
| Lists: | pgsql-hackers |
Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> On Fri, Apr 3, 2026 at 11:50 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> How about using --format=ustar, instead of that sparse control stuff?
>> I did it that way for GNU tar, but did not research whether bsdtar
>> will take that option. Feel free to hack on ebba64c08 some more.
> This seems to work for both:
> $ tar --format=ustar -c /dev/null > /dev/null
> tar: Removing leading '/' from member names
> $ gtar --format=ustar -c /dev/null > /dev/null
> gtar: Removing leading `/' from member names
Cool. LGTM.
> I think a Windows system could be using either. BSD tar comes
> pre-installed by Microsoft and people often install GNU tools. So I
> think we should use File::Spec->devnull() instead of /dev/null, and
> Andrew showed that working.
Agreed.
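
For illustration only, here is a minimal Python sketch of the same portability probe. The tests under discussion are Perl, so this is not the code in the tree; `os.devnull` stands in for `File::Spec->devnull()`, and the function name `tar_accepts_ustar` is hypothetical. It also passes `-f -` explicitly, an assumption made here so that the archive goes to stdout regardless of a tar build's default output device.

```python
import os
import shutil
import subprocess
from typing import Optional


def tar_accepts_ustar(tar_program: str = "tar") -> Optional[bool]:
    """Probe whether a tar binary accepts --format=ustar.

    Returns True or False according to tar's exit status, or None when
    the program cannot be found on PATH.
    """
    exe = shutil.which(tar_program)
    if exe is None:
        return None
    # os.devnull is "/dev/null" on Unix-like systems and "nul" on Windows,
    # so the same command line works without hard-coding the path.
    result = subprocess.run(
        [exe, "--format=ustar", "-c", "-f", "-", os.devnull],
        stdout=subprocess.DEVNULL,  # discard the archive itself
        stderr=subprocess.DEVNULL,  # discard "Removing leading '/'" notices
        check=False,
    )
    return result.returncode == 0


if __name__ == "__main__":
    for prog in ("tar", "gtar", "bsdtar"):
        print(prog, tar_accepts_ustar(prog))
```

A caller would treat `None` as "no such tar installed" and skip the test, and treat `False` as "this tar needs a different way to suppress pax or sparse output".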
> Longer term I think we need to tolerate but ignore pax headers. If I
> understand the spirit of this long evolution, pax archives are
> intended to be acceptable to pre-pax implementations, which implies
> that they can't really change the meaning of the bits of the file
> contents.
I don't buy that. For example, POSIX specifies these allowed
fields in an extended header:
    linkpath
        The pathname of a link being created to another file, of any
        type, previously archived. This record shall override the
        linkname field in the following ustar header block(s).
    path
        The pathname of the following file(s). This record shall
        override the name and prefix fields in the following header
        block(s).
    size
        The size of the file in octets, expressed as a decimal number
        using digits from the ISO/IEC 646:1991 standard. This record
        shall override the size field in the following header
        block(s).
GNU tar seems to try hard to ensure that a non-pax-aware tar can
extract *something* from a tar file, but it's not guaranteed that the
something contains the right data or is located at the right pathname.
It looks like the goal is to allow post-processing to pick up the
pieces.
In any case, this is all completely moot if we don't write code to
de-sparse a sparse entry: we will not be able to validate WAL data
if the WAL file is missing some pages. So I see little point in
having code that tolerates pax headers if it doesn't also do that.
regards, tom lane