From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Amul Sul <sulamul(at)gmail(dot)com>, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
Date: 2026-03-21 03:49:02
Message-ID: 2178517.1774064942@sss.pgh.pa.us
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Thanks, committed with very minor tweaks.

Buildfarm members batta and hachi don't like this very much.
They fail the pg_verifybackup tests like so:

# Running: pg_verifybackup --exit-on-error /home/admin/batta/buildroot/HEAD/pgsql.build/src/bin/pg_verifybackup/tmp_check/t_008_untar_primary_data/backup/server-backup
pg_waldump: error: could not find WAL in archive "base.tar.zst"
pg_verifybackup: error: WAL parsing failed for timeline 1

Only the zstd-compression case fails. I've spent several hours trying
to reproduce this, without any luck, although I can get a similar
failure in only the gzip case if I build with --with-wal-blocksize=64.
I do not have an explanation for the seeming cross-platform
difference. However after adding a lot of debug tracing, I believe
I see the bug, or at least a related bug. This bit in
archive_waldump.c's init_archive_reader is where the error comes from:

    /*
     * Read until we have at least one full WAL page (XLOG_BLCKSZ bytes) from
     * the first WAL segment in the archive so we can extract the WAL segment
     * size from the long page header.
     */
    while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
    {
        if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
            pg_fatal("could not find WAL in archive \"%s\"",
                     privateInfo->archive_name);

        entry = privateInfo->cur_file;
    }

That looks plausible but is in fact utterly broken when there's not a
lot of WAL data in the archive, as there is not in this test case.
There are at least two problems:

1. read_archive_file reads some data from the source WAL archive and
shoves it into the astreamer decompression pipeline. However, once it
runs out of source data, it just returns zero and we fail immediately.
This does not account for the possibility --- nay, certainty --- that
there is data queued inside the decompression pipeline. So this
doesn't work if the data we need has been compressed into less than
XLOG_BLCKSZ worth of compressed data. (I suppose that the seeming
cross-platform differences have to do with the effectiveness of the
compression algorithm, but I don't really understand why it'd not be
the same everywhere.) We need to do astreamer_finalize once we run
out of source data. I think the cleanest place to handle that would
be inside read_archive_file, but its return convention will need some
rework if we want to put it there (because rc == 0 shouldn't cause an
immediate failure if we were able to finalize some more data). As an
ugly experiment I put an astreamer_finalize call into the rc == 0 path
of the above loop, but it still didn't work, because:

2. If the decompression pipeline reaches the end of the WAL file that
we want, the ASTREAMER_MEMBER_TRAILER case in
astreamer_waldump_content instantly resets privateInfo->cur_file to
NULL. Then the loop in init_archive_reader cannot exit successfully,
and it will just read till the end of the archive and fail.

I see that of the three callers of read_archive_file, only
get_archive_wal_entry is aware of this possibility; but
init_archive_reader certainly needs to deal with it and I bet
read_archive_wal_page does too. Moreover, get_archive_wal_entry's
solution looks to me like a fragile kluge that probably doesn't work
reliably either, the reason being that privateInfo->cur_file can
change multiple times during a single call to read_archive_file,
if the WAL data has been compressed sufficiently. That whole API
seems to need some rethinking, not to mention better documentation
than the zero it has now.

While I'm bitching: this error message "could not find WAL in archive
\"%s\"" seems to me to be completely misleading and off-point.

regards, tom lane
