Re: pg_waldump: support decoding of WAL inside tarfile

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Amul Sul <sulamul(at)gmail(dot)com>, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
Date: 2026-03-25 17:25:21
Message-ID: 374225.1774459521@sss.pgh.pa.us
Lists: pgsql-hackers

Michael Paquier <michael(at)paquier(dot)xyz> writes:
> The buildfarm has switched mostly to green, except on this one:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hoatzin&dt=2026-03-23%2006%3A00%3A42

Yeah, there are some other weird failures on other machines too.
There's also the problem we already knew about of FD leakage breaking
cleanup of the temp file directory on Windows.

I wrote a patch to fix the FD leakage problem (v8-0001 attached).
I don't have Windows at hand, but I tested it by dint of having
the atexit callback invoke "lsof" to see if there were any open
files in the temp directory.
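
For illustration only (this is not the code in the attached patch), the
testing trick described above could be sketched roughly as follows. The
names build_lsof_command, report_open_files and install_fd_leak_check are
hypothetical, and the sketch assumes that "lsof" is on the PATH and that
the directory name contains no single quotes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Command line run at exit; filled in by install_fd_leak_check(). */
static char lsof_cmd[1024];

/*
 * Build "lsof +D '<dir>'" into buf.  Returns 0 on success, or -1 if the
 * command would not fit.  (Hypothetical helper; quoting is naive.)
 */
int
build_lsof_command(char *buf, size_t buflen, const char *dir)
{
    int         n;

    assert(buf != NULL && dir != NULL);
    n = snprintf(buf, buflen, "lsof +D '%s'", dir);
    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}

/* atexit callback: list any files still open under the temp directory. */
static void
report_open_files(void)
{
    fflush(NULL);               /* keep our own output ahead of lsof's */
    if (system(lsof_cmd) == -1)
        perror("system");
}

/* Register the check; call once, after the temp directory is known. */
int
install_fd_leak_check(const char *tmpdir)
{
    if (build_lsof_command(lsof_cmd, sizeof(lsof_cmd), tmpdir) != 0)
        return -1;
    return atexit(report_open_files);
}
```

If install_fd_leak_check is registered after the program's own cleanup
callback, it runs first (atexit callbacks run in reverse order of
registration), so anything lsof prints is a descriptor that was still
open when cleanup was about to start.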

I also occasionally saw some of the weird errors mentioned above.
After much debugging, I believe that the issue is that
archive_waldump.c is unaware that inserting or deleting entries
in a simplehash.h hash table can cause other entries to move.
That can break privateInfo->cur_file, and it can also break
read_archive_wal_page, which assumed it could just re-use its
entry pointer after calling read_archive_file. v8-0002 attached
fixes that, and I'm not seeing weird failures anymore.
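
To make the hazard concrete, here is a minimal sketch using a toy
open-addressing table. It is NOT simplehash.h, and every name in it
(Table, table_insert, table_lookup, demo_entry_moves) is hypothetical.
The toy only moves entries when the table grows, whereas the issue
described above also covers moves on ordinary inserts and deletes, but
the rule it illustrates is the same: keep the key, not the pointer, and
look the entry up again after anything that may modify the table.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy table with linear probing; key 0 marks an empty slot. */
typedef struct Entry { uint32_t key; int value; } Entry;
typedef struct Table { Entry *slots; size_t size; size_t used; } Table;

/* Find the slot holding key, or the empty slot where it would go. */
static Entry *
probe(Entry *slots, size_t size, uint32_t key)
{
    size_t      i = (size_t) (key * 2654435761u) & (size - 1);

    while (slots[i].key != 0 && slots[i].key != key)
        i = (i + 1) & (size - 1);
    return &slots[i];
}

static void
table_init(Table *t)
{
    t->size = 4;                /* must stay a power of two */
    t->used = 0;
    t->slots = calloc(t->size, sizeof(Entry));
    if (t->slots == NULL)
        abort();
}

static Entry *
table_lookup(Table *t, uint32_t key)
{
    Entry      *e = probe(t->slots, t->size, key);

    return (e->key == key) ? e : NULL;
}

/* Insert or update; may rebuild the slot array, moving every entry. */
static Entry *
table_insert(Table *t, uint32_t key, int value)
{
    Entry      *e;

    if ((t->used + 1) * 2 > t->size)
    {
        size_t      newsize = t->size * 2;
        Entry      *newslots = calloc(newsize, sizeof(Entry));

        if (newslots == NULL)
            abort();
        for (size_t i = 0; i < t->size; i++)
            if (t->slots[i].key != 0)
                *probe(newslots, newsize, t->slots[i].key) = t->slots[i];
        free(t->slots);         /* old entry pointers now dangle */
        t->slots = newslots;
        t->size = newsize;
    }
    e = probe(t->slots, t->size, key);
    if (e->key == 0)
    {
        e->key = key;
        t->used++;
    }
    e->value = value;
    return e;
}

/* Returns 1 if key 1's entry moved and was safely found again by key. */
int
demo_entry_moves(void)
{
    Table       t;
    Entry      *cur;
    uintptr_t   before;
    int         ok;

    table_init(&t);
    cur = table_insert(&t, 1, 100);
    before = (uintptr_t) cur;   /* remember the address, not the pointer */
    table_insert(&t, 2, 200);
    table_insert(&t, 3, 300);   /* third insert grows the table */

    /* "cur" must not be dereferenced here; look it up again by key. */
    cur = table_lookup(&t, 1);
    ok = (cur != NULL && (uintptr_t) cur != before && cur->value == 100);
    free(t.slots);
    return ok;
}
```

In this sketch the only safe handle to carry across a table-modifying
call is the key. Caching the entry pointer in a long-lived field, or
re-using it after a call that may insert or delete, is the pattern that
goes wrong.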

Two additional thoughts:

1. The amount of data that pg_waldump's TAP tests use is not
sufficient to trigger these problems with any degree of reliability.
I'm hesitant to make the tests run longer, but clearly we do not
have adequate coverage now.

2. I didn't do it here, but I think we should urgently rip out
read_archive_wal_page's stanza that truncates the entry's
"buf" string (the "if (privateInfo->decoding_started)" part).
My faith in this code in general is at rock bottom, and my faith in
the extent to which we've tested it is somewhere below ground level.
I don't think we need rickety optimizations that serve only to
keep the active hashtable entry to less than 16MB, when we're going
to reclaim that space altogether as soon as we've finished dumping
that segment. This truncation scares me because it adds a whole
'nother level of poorly-documented complexity to the invariants
around what is in entry->buf. Also, while we theoretically should
not need to spill the entry after this point, if we did we would
write a corrupted spill file.

regards, tom lane

Attachment                                                       Content-Type  Size
v8-0001-Fix-file-descriptor-leakages-in-pg_waldump.patch         text/x-diff   5.8 KB
v8-0002-Fix-misuse-of-simplehash.h-hash-operations-in-pg_.patch  text/x-diff   3.8 KB
