Re: pg_waldump: support decoding of WAL inside tarfile

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Euler Taveira <euler(at)eulerto(dot)com>
Cc: Amul Sul <sulamul(at)gmail(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
Date: 2026-01-28 13:03:02
Message-ID: CA+Tgmoam3nNANQrVaN3vnZHji5t0KLmA94gcOWAEJsi3L0WyoA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jan 27, 2026 at 10:02 PM Euler Taveira <euler(at)eulerto(dot)com> wrote:
> + * archive_waldump.c
> + * A generic facility for reading WAL data from tar archives via archive
> + * streamer.
>
> The other tools (pg_basebackup and pg_verifybackup) that also use astreamer API
> named this similar file as astreamer_SOMETHING.c. It seems a good idea to
> follow the same pattern, no? Maybe astreamer_tar_archive.c or
> astreamer_archive.c.

There shouldn't be anything specific to tar files in here, and
astreamer_archive would be meaningless, since the "a" in "astreamer"
stands for archive. This file is an archive streamer specific to
pg_waldump, hence the name.

> Can it enforce a specific order? tar follows an arbitrary order in which the
> files is returned by the filesystem. You've been debating a solution to buffer
> the WAL contents using memory or spilled files. If it always create the tar in
> an alphabetical order, you can reduce the scope of this patch. (Didn't look
> what challenges are expected to use a sorted list to generate the tar file.)

It's possible to create a tar file in a specific order by specifying
command-line arguments to tar in the order you want the tar file to be
built. But I think the more important point is that this limitation is
lifted by the following patch. Whether it's worth splitting the work
into two patches this way is debatable. As I have pointed out in my
previous reviews, the split hasn't been done very cleanly.
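
To illustrate the first point, here is a minimal sketch of building an
archive in a chosen member order. The segment names and the temporary
directory are made up for the example, and it assumes a tar that accepts
-C (GNU tar and bsdtar both do). It is not part of the patch.

```shell
# Sketch only: file names and paths below are hypothetical placeholders.
set -eu

workdir=$(mktemp -d)
mkdir "$workdir/pg_wal"

# Create placeholder "segments" in a deliberately unsorted order.
for seg in 000000010000000000000003 \
           000000010000000000000001 \
           000000010000000000000002; do
    printf 'placeholder\n' > "$workdir/pg_wal/$seg"
done

# tar writes members in the order the names appear on the command line,
# so listing them in ascending order produces an ordered archive.
tar -cf "$workdir/wal.tar" -C "$workdir/pg_wal" \
    000000010000000000000001 \
    000000010000000000000002 \
    000000010000000000000003

# Listing the archive shows members in the order they were stored.
tar -tf "$workdir/wal.tar"
```

This only helps when whoever creates the archive controls the tar
invocation, which is not something pg_waldump can assume in general.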

--
Robert Haas
EDB: http://www.enterprisedb.com
