Re: pg_waldump: support decoding of WAL inside tarfile

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amul Sul <sulamul(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
Date: 2026-03-21 18:23:55
Message-ID: 2360498.1774117435@sss.pgh.pa.us
Lists: pgsql-hackers

I have made some progress on the question of how to reproduce
these failures. If I do this:

diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index b078c2d6960..c389a227be5 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -178,7 +178,7 @@ init_archive_reader(XLogDumpPrivate *privateInfo,
 	 */
 	while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
 	{
-		if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+		if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
 			pg_fatal("could not find WAL in archive \"%s\"",
 					 privateInfo->archive_name);

then I get the "could not find WAL in archive" failures in
pg_verifybackup's gzip and lz4 tests, but not zstd. This happens
reproducibly even without any special hacks on XLOG_BLCKSZ or
wal_compression settings. Of course, this is not exactly what's
happening on batta/hachi, because they fail on zstd and not the
other two. But I think it confirms my theory that the problem
is essentially poor handling of EOF boundary conditions.
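
To make the kind of EOF boundary condition meant here concrete, the following is a toy model, not the actual pg_waldump code. Every name in it (DemoSource, DemoDecoder, demo_read, demo_feed, demo_finish, demo_wait_naive, demo_wait_eof_aware) is hypothetical, and the "decoder" that releases its output one call late is purely an assumption chosen to stand in for a decompressor that holds data back until it is told the input has ended. It only sketches why a larger read size could turn a zero-byte read into a spurious "not found" result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical input: only the total length and read position matter here. */
typedef struct DemoSource
{
	size_t		len;
	size_t		pos;
} DemoSource;

/* Hypothetical decoder that releases each batch of bytes one call late. */
typedef struct DemoDecoder
{
	size_t		visible;	/* bytes the consumer can already see */
	size_t		held;		/* bytes accepted but not yet released */
} DemoDecoder;

/* Consume up to "count" bytes; returns 0 only once the input is exhausted. */
static size_t
demo_read(DemoSource *src, size_t count)
{
	size_t		avail = src->len - src->pos;
	size_t		n = (count < avail) ? count : avail;

	assert(count > 0);
	src->pos += n;
	return n;
}

/* Release the previously held bytes and hold the new ones. */
static void
demo_feed(DemoDecoder *dec, size_t n)
{
	dec->visible += dec->held;
	dec->held = n;
}

/* End of input: release whatever is still held. */
static void
demo_finish(DemoDecoder *dec)
{
	dec->visible += dec->held;
	dec->held = 0;
}

/* Naive loop: any zero-byte read is reported as failure. */
static bool
demo_wait_naive(size_t total, size_t chunk, size_t need)
{
	DemoSource	src = {total, 0};
	DemoDecoder dec = {0, 0};

	while (dec.visible < need)
	{
		size_t		n = demo_read(&src, chunk);

		if (n == 0)
			return false;	/* may be wrong: data can still be held */
		demo_feed(&dec, n);
	}
	return true;
}

/* EOF-aware loop: flush the decoder at EOF, then re-check the condition. */
static bool
demo_wait_eof_aware(size_t total, size_t chunk, size_t need)
{
	DemoSource	src = {total, 0};
	DemoDecoder dec = {0, 0};

	while (dec.visible < need)
	{
		size_t		n = demo_read(&src, chunk);

		if (n == 0)
		{
			demo_finish(&dec);
			return dec.visible >= need;
		}
		demo_feed(&dec, n);
	}
	return true;
}
```

In this model, a 10000-byte input read in 1024-byte chunks satisfies an 8192-byte requirement in either loop, whereas reading it in a single 131072-byte chunk makes the naive loop fail on the following zero-byte read while the EOF-aware loop still succeeds. Whether the real gzip and lz4 paths behave this way is not established by the model.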

(Per discussion, there are other bugs here too; I don't mean
to minimize that aspect.)

regards, tom lane
