From: | Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com> |
---|---|
To: | Amul Sul <sulamul(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: pg_waldump: support decoding of WAL inside tarfile |
Date: | 2025-09-08 13:37:02 |
Message-ID: | CAKZiRmzzmaQupeHoASWVDwuoizTttDcBB8mjB7KNVhB6F6x2-Q@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Aug 26, 2025 at 1:53 PM Amul Sul <sulamul(at)gmail(dot)com> wrote:
>
[..patch]
Hi Amul!
0001: LGTM, maybe I would just slightly enhance the commit message
("This is in preparation for adding a second source file to this
directory.") -- maye bit a bit more verbose or use a message from
0002?
0002: LGTM
0003: LGTM
Tested here (after partial patch apply, and test suite did work fine).
0004:
a. Why should it be necessary to provide startLSN (-s) ? Couldn't
it autodetect the first WAL (tar file) inside and just use that with
some info message?
$ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar
pg_waldump: error: no start WAL location given
b. Why would it like to open "blah" dir if I wanted that "blah"
segment from the archive? Shouldn't it tell that it was looking in the
archive and couldn find it inside?
$ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar blah
pg_waldump: error: could not open file "blah": Not a directory
c. It doesnt work when using SEGSTART, but it's there:
$ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar
000000010000000000000059
pg_waldump: error: could not open file "000000010000000000000059":
Not a directory
$ tar tf /tmp/base/pg_wal.tar | head -1
000000010000000000000059
d. I've later noticed that follow-up patches seem to use the
-s switch and there it seems to work OK. The above SEGSTART issue was
not detected, probably because tests need to be extended cover of
segment name rather than just --start LSN (see test_pg_waldump):
$ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar --stats
-s 0/59000358
pg_waldump: first record is after 0/59000358, at 0/590003E8,
skipping over 144 bytes
WAL statistics between 0/590003E8 and 0/61000000:
[..]
e. Code around`if (walpath == NULL && directory != NULL)` needs
some comments.
f. Code around `if (fname != NULL && is_tar_file(fname,
&compression))` , so if fname is WAL segment here
(00000001000000000000005A) and we do check again if that has been
tar-ed (is_tar_file())? Why?
g. Just a question: the commit message says `Note that this patch
requires that the WAL files within the archive be in sequential order;
an error will be reported otherwise`. I'm wondering if such
occurrences are known to be happening in the wild? Or is it just an
assumption that if someone would modify the tar somehow? (either way
we could just add a reason why we need to handle such a case if we
know -- is manual alternation the only source of such state?). For the
record, I've tested crafting custom archives with out of sequence WAL
archives and the code seems to work (it was done using: tar --append
-f pg_wal.tar --format=ustar ..)
h. Anyway, in case of typo/wrong LSN, 0004 emits wrong error
message I think:
$ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar --stats
-s 0/50000358
pg_waldump: error: WAL files are not archived in sequential order
pg_waldump: detail: Expecting segment number 80 but found 89.
it's just that the 50000358 LSN above is below the minimal LSN
present in the WAL segments (first segment is 000000010000000000000059
there, i've just intentionally provided a bad value 50.. as a typo and
it causes the wrong message). Now it might not be an issue as with
0005 patch the same test behaves OK (`pg_waldump: error: could not
find a valid record after 0/50000358`). It is just relevant if this
would be committed not all at once.
i. If I give wrong --timeline=999 to pg_waldump it fails with
misleading error message: could not read WAL data from "pg_wal.tar"
archive: read -1 of 8192
0005:
a. I'm wondering if we shouldn't log (to stderr?) some kind of
notification message (just once) that non-sequential WAL files were
discovered and that pg_waldump is starting to write to $somewhere as
it may be causing bigger I/O than anticipated when running the
command. This can easily help when troubleshooting why it is not fast,
and also having set TMPDIR to usually /tmp can be slow or too small.
b. IMHO member_prepare_tmp_write() / get_tmp_wal_file_path() with
TMPDIR can be prone to symlink attack. Consider setting TMPDIR=/tmp .
We are writing to e.g. /tmp/<WALsegment>.waldump.tmp in 0004 , but
that path is completely guessable. If an attacker prepares some
symlinks and links those to some other places, I think the code will
happily open and overwrite the contents of the rogue symlink. I think
using mkstemp(3)/tmpfile(3) would be a safer choice if TMPDIR needs to
be in play. Consider that pg_waldump can be run as root (there's no
mechanism preventing it from being used that way).
c. IMHO that unlink() might be not guaranteed to always remove
files, as in case of any trouble and exit() , those files might be
left over. I think we need some atexit() handlers. This can be
triggered with combo of options of nonsequential files in tar + wrong
LSN given:
$ tar tf pg_wal.tar
00000001000000000000005A
00000001000000000000005B
00000001000000000000005C
[..]
000000010000000000000060
000000010000000000000059 <-- out of order, appended last
$ ls -lh 0*
ls: cannot access '0*': No such file or directory
$ /usr/pgsql19/bin/pg_waldump --path=/tmp/ble/pg_wal.tar --stats
-s 0/10000358 #wrong LSN
pg_waldump: error: could not find a valid record after 0/10000358
$ ls -lh 0*
-rw------- 1 postgres postgres 16M Sep 8 14:44
000000010000000000000059.waldump.tmp
-rw------- 1 postgres postgres 16M Sep 8 14:44
00000001000000000000005A.waldump.tmp
[..]
0006: LGTM
0007:
a. Commit message says `Future patches to pg_waldump will enable
it to decode WAL directly` , but those pg_waldump are earlier patches,
right?
b. pg_verifybackup should print some info with --progress that it
is spawning pg_waldump (pg_verifybackup --progress mode does not
display anything related to verifing WALs, but it could)
c. I'm wondering, but pg_waldump seems to be not complaining if
--end=LSN is made into such a future that it doesn't exist. E.g. If
the latest WAL segment is 60 (with end LSN 0/60A77A59), but I run
pg_waldump `--end=0/7000000` , it will return code 0 and nothing on
stderr. So how sure are we that the necessary WAL segments (as per
backup_manifest) are actually inside the tar? It's supposed to be
verified, but it isn't for this use case? Same happens if craft
special tar and remove just one WAL segment from pg_wal.tar (simulate
missing WAL segment), but ask the pg_verifybackup/pg_waldump to verify
it to exact last LSN sequence, e.g.:
$ /usr/pgsql19/bin/pg_waldump --quiet
--path=/tmp/missing/pg_wal.tar --timeline=1 --start=0/59000028
--end=0/60A77A58 && echo OK # but it is not OK
OK
$ /usr/pgsql19/bin/pg_waldump --stats
--path=/tmp/missing/pg_wal.tar --timeline=1 --start=0/59000028
--end=0/60A77A58
WAL statistics between 0/59000028 and 0/5CFFFFD0: # <-- 0/5C LSN
maximum detected
[..]
Notice it has read till 0/5C (but I've asked till 0/60), because
I've removed 0D:
$ tar tf /tmp/missing/pg_wal.tar| grep ^0
000000010000000000000059
00000001000000000000005A
00000001000000000000005B
00000001000000000000005C
00000001000000000000005E <-- missing 5D
Yet it reported no errors.
0008:
LGTM
Another open question I have is this: shouldn't backup_manifest come
with CRC checksum for the archived WALs? Or does that guarantee that
backup_manifest WAL-Ranges are present in pg_wal.tar is good enough
because individual WAL files are CRC-protected itself?
-J.
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Treat | 2025-09-08 13:37:39 | Re: [PATCH] Proposal to Enable/Disable Index using ALTER INDEX |
Previous Message | Robert Haas | 2025-09-08 13:36:13 | Re: [PG19-3 PATCH] Don't ignore passfile |