| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
| Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Amul Sul <sulamul(at)gmail(dot)com>, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: pg_waldump: support decoding of WAL inside tarfile |
| Date: | 2026-03-22 23:13:11 |
| Message-ID: | 3341199.1774221191@sss.pgh.pa.us |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Everything's pushed, although it looks like it'll be a few hours
before we see results from batta or hachi.
In the meantime, I have one more change that I'd like to propose in
archive_waldump.c: I think we should drop read_archive_file's "count"
parameter and just always try to fill archive_read_buf completely,
as attached. I don't believe that using a variable count is buying
anything. I get that the idea was to not load more data than we have
to before init_archive_reader can identify the segment size and start
filtering data we don't need. But I doubt that that helps a whole
lot: I think most of the cost in this pipeline is going into
decompression and tar format parsing, not into copying data into the
hashtable. Also, I noticed while I was investigating the bugs we
just fixed that the tar file very likely contains a ton of other
data before the first WAL file. In the regression test scenarios
it seems to reliably have all of the cluster's actual table data
ahead of the WAL directory. By using a small read count, we are
slowing down scanning of all that data. Maybe not by much, I've
not tried to measure it. But I think the case for using a variable
count is very weak.
Even if you don't like that change, I think we should apply the
parts of the attached that change archive_read_buf_size into an
always-present, always-filled field of XLogDumpPrivate. Having
its presence be dependent on USE_ASSERT_CHECKING is just asking
for trouble, ie different .c files having a different opinion
of whether it's there.
regards, tom lane
| Attachment | Content-Type | Size |
|---|---|---|
| v1-remove-read_archive_file-count-parameter.patch | text/x-diff | 4.2 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andrew Jackson | 2026-03-22 23:38:41 | Re: Add ldapservice connection parameter |
| Previous Message | Tom Lane | 2026-03-22 21:39:26 | Re: pg_waldump: support decoding of WAL inside tarfile |