| From: | Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com> |
|---|---|
| To: | Amul Sul <sulamul(at)gmail(dot)com> |
| Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: pg_waldump: support decoding of WAL inside tarfile |
| Date: | 2025-11-19 08:20:14 |
| Message-ID: | CAKZiRmyDk5KqovS9Ez3iFHd+p-TChSt2QTtWkwJ5Ya-+4gg21g@mail.gmail.com |
| Lists: | pgsql-hackers |
On Mon, Nov 17, 2025 at 5:51 AM Amul Sul <sulamul(at)gmail(dot)com> wrote:
>
> On Thu, Nov 6, 2025 at 2:33 PM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> >
> > On Mon, Oct 20, 2025 at 8:05 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Oct 16, 2025 at 7:49 AM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> > > [....]
> > Kindly have a look at the attached version. Thank you !
> >
>
> Attached is the rebased version against the latest master head (e76defbcf09).
Hi Amul, thanks for working on this. I haven't really looked at the
source code deeply (I trust Robert's eyes much more than mine on this
one), just skimmed it a little bit:
1. As stated earlier, get_tmp_walseg_path() is still vulnerable (it
uses a predictable path in $TMPDIR that an attacker could exploit)
2. On the usability front:
a. If you run `pg_waldump --path pg_wal.tar -s 0/31000000`, it dumps
a lot of WAL records and then prints a final error:
pg_waldump: error: could not find file "000000010000000000000034" in archive
However, with `pg_waldump --path pg_wal.tar -s 0/31000000
--stats=record` (without passing '-e') it simply bails out without
printing any stats, with the same error:
pg_waldump: error: could not find file "000000010000000000000034" in archive
IMHO, it could still print the stats if it managed to read at least one
WAL record.
3. The most critical issue for me was the initial lack of error
pass-through from pg_waldump (when used with WALs in a tar) to
pg_verifybackup. Now it works fine, so thanks for this:
a. pg_waldump is capable of detecting missing WAL files in the requested
range and returning a proper exit code (good)
$ /usr/pgsql19/bin/pg_waldump --path pg_wal.tar -s 0/31005F70 -e 0/343D2650 -q
pg_waldump: error: could not find file "000000010000000000000034" in archive
$ echo $?
1
$
b. pg_verifybackup now also complains properly about a missing WAL file inside the tar
$ tar --delete -f pg_wal.tar 000000010000000000000032 # simulate loss of file
$ tar -tf pg_wal.tar
000000010000000000000031
archive_status/000000010000000000000031.done
archive_status/000000010000000000000032.done
000000010000000000000033
$ grep Start-LSN backup_manifest
{ "Timeline": 1, "Start-LSN": "0/31005F70", "End-LSN": "0/333D2650" }
$ /usr/pgsql19/bin/pg_verifybackup -P /tmp/basebackup/
791372/791372 kB (100%) verified
pg_waldump: error: could not find file "000000010000000000000032" in archive
pg_verifybackup: error: WAL parsing failed for timeline 1
$ echo $?
1
$
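Regarding point 1, here is a rough sketch of what I mean by a
non-predictable path. This is not code from the patch:
create_private_walseg_tmpfile() and its arguments are hypothetical names
made up for illustration, and it assumes a POSIX platform where
mkstemp() is available (Windows would need something else). The idea is
just to let the system pick a random suffix and create the file
exclusively, instead of building a name the attacker can guess in
advance.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): create a temporary file for
 * one WAL segment under tmpdir, using a name that cannot be predicted.
 *
 * mkstemp() replaces the trailing XXXXXX with random characters and
 * creates the file with O_CREAT | O_EXCL, so a file or symlink planted
 * at the chosen name makes the call fail instead of being followed.
 *
 * Returns an open read/write descriptor and stores the final path in
 * path_out, or returns -1 on failure.
 */
static int
create_private_walseg_tmpfile(const char *tmpdir, const char *segname,
                              char *path_out, size_t path_len)
{
    int     fd;
    int     n;
    mode_t  old_umask;

    assert(tmpdir != NULL && segname != NULL && path_out != NULL);

    /* Build the template; it must end in exactly six X characters. */
    n = snprintf(path_out, path_len, "%s/%s.XXXXXX", tmpdir, segname);
    if (n < 0 || (size_t) n >= path_len)
        return -1;              /* template would have been truncated */

    /* Some older libcs apply the umask in mkstemp(), so restrict it. */
    old_umask = umask(S_IRWXG | S_IRWXO);
    fd = mkstemp(path_out);
    umask(old_umask);

    return fd;
}
```

The caller would write the segment through the returned descriptor and
unlink() the path once it is done with it. Creating a private directory
with mkdtemp() and placing all the segments inside it would be another
option, if a stable per-segment file name is needed.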
-J.