| From: | Amul Sul <sulamul(at)gmail(dot)com> |
|---|---|
| To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
| Cc: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: pg_waldump: support decoding of WAL inside tarfile |
| Date: | 2026-02-18 06:58:27 |
| Message-ID: | CAAJ_b979p7Z-dBZhxBC3m2VPkYTvYzypMTWkYg2q-e1S5F_f-Q@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Tue, Feb 10, 2026 at 3:06 PM Amul Sul <sulamul(at)gmail(dot)com> wrote:
>
> On Wed, Feb 4, 2026 at 6:39 PM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> >
> > On Wed, Jan 28, 2026 at 2:41 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Jan 27, 2026 at 7:07 AM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> > > > In the attached version, I am using the WAL segment name as the hash
> > > > key, which is much more straightforward. I have rewritten
> > > > read_archive_wal_page(), and it looks much cleaner than before. The
> > > > logic to discard irrelevant WAL files is still within
> > > > get_archive_wal_entry. I added an explanation for setting cur_wal to
> > > > NULL, which is now handled in the separate function I mentioned
> > > > previously.
> > > >
> > > > Kindly have a look at the attached version; let me know if you are
> > > > still not happy with the current approach for filtering/discarding
> > > > irrelevant WAL segments. It isn't much different from the previous
> > > > version, but I have tried to keep it in a separate routine for better
> > > > code readability, with comments to make it easier to understand. I
> > > > also added a comment for ArchivedWALFile.
> > >
> > > I feel like the division of labor between get_archive_wal_entry() and
> > > read_archive_wal_page() is odd. I noticed this in the last version,
> > > too, and it still seems to be the case. get_archive_wal_entry() first
> > > calls ArchivedWAL_lookup(). If that finds an entry, it just returns.
> > > If it doesn't, it loops until an entry for the requested file shows up
> > > and then returns it. Then control returns to read_archive_wal_page()
> > > which loops some more until we have all the data we need for the
> > > requested file. But it seems odd to me to have two separate loops
> > > here. I think that the first loop is going to call read_archive_file()
> > > until we find the beginning of the file that we care about and then
> > > the second one is going to call read_archive_file() some more until we
> > > have read enough of it to satisfy the request. It feels odd to me to
> > > do it that way, as if we told somebody to first wait until 9 o'clock
> > > and then wait another 30 minutes, instead of just telling them to wait
> > > until 9:30. I realize it's not quite the same thing, because apart
> > > from calling read_archive_file(), the two loops do different things,
> > > but I still think it looks odd.
> > >
> > > + /*
> > > + * Ignore if the timeline is different or the current segment is not
> > > + * the desired one.
> > > + */
> > > + XLogFromFileName(entry->fname, &curSegTimeline, &curSegNo, WalSegSz);
> > > + if (privateInfo->timeline != curSegTimeline ||
> > > + privateInfo->startSegNo > curSegNo ||
> > > + privateInfo->endSegNo < curSegNo ||
> > > + segno > curSegNo)
> > > + {
> > > + free_archive_wal_entry(entry->fname, privateInfo);
> > > + continue;
> > > + }
> > >
> > > The comment doesn't match the code. If it did, the test would be
> > > (privateInfo->timeline != curSegTimeline || segno != curSegno). But
> > > instead the segno test is > rather than !=, and the checks against
> > > startSegNo and endSegNo aren't explained at all. I think I understand
> > > why the segno test uses > rather than !=, but it's the point of the
> > > comment to explain things like that, rather than leaving the reader to
> > > guess. And I don't know why we also need to test startSegNo and
> > > endSegNo.
> > >
> > > I also wonder what the point is of doing XLogFromFileName() on the
> > > fname provided by the caller and then again on entry->fname. Couldn't
> > > you just compare the strings?
> > >
> > > Again, the division of labor is really odd here. It's the job of
> > > astreamer_waldump_content() to skip things that aren't WAL files at
> > > all, but it's the job of get_archive_wal_entry() to skip things that
> > > are WAL files but not the one we want. I disagree with putting those
> > > checks in completely separate parts of the code.
> > >
> >
> > Keeping the timeline and segment start-end range checks inside the
> > archive streamer creates a circular dependency that cannot be resolved
> > without a 'dirty hack'. We must read the first available WAL file page
> > to determine the wal_segment_size before it can calculate the target
> > segment range. Moving the checks inside the streamer would make it
> > impossible to process that initial file, as the necessary filtering
> > parameters -- would still be unknown which would need to be skipped
> > for the first read somehow. What if later we realized that the first
> > WAL file which was allowed to be streamed by skipping that check is
> > irrelevant and doesn't fall under the start-end segment range?
> >
>
> Please have a look at the attached version, specifically patch 0005.
> In astreamer_waldump_content(), I have moved the WAL file filtration
> check from get_archive_wal_entry(). This check will be skipped during
> the initial read in init_archive_reader(), which instead performs it
> explicitly once it determines the WAL segment size and the start/end
> segments.
>
> To access the WAL segment size inside astreamer_waldump_content(), I
> have moved the WAL segment size variable into the XLogDumpPrivate
> structure in the separate 0004 patch.
Attached is an updated version including the aforesaid changes. It
includes a new refactoring patch (0001) that moves the logic for
identifying tar archives and their compression types from
pg_basebackup and pg_verifybackup into a separate-reusable function,
per a suggestion from Euler [1]. Additionally, I have added a test
for the contrecord decoding to the main patch (now 0006).
1] http://postgr.es/m/3c8e7b02-2152-495a-a0b6-e37cf9286a70@app.fastmail.com
Regards,
Amul
| Attachment | Content-Type | Size |
|---|---|---|
| v13-0001-Refactor-Move-tar-archive-parsing-into-a-common-.patch | application/x-patch | 6.7 KB |
| v13-0002-Refactor-pg_waldump-Move-some-declarations-to-ne.patch | application/x-patch | 2.2 KB |
| v13-0003-Refactor-pg_waldump-Separate-logic-used-to-calcu.patch | application/x-patch | 2.4 KB |
| v13-0004-Refactor-pg_waldump-Restructure-TAP-tests.patch | application/x-patch | 6.6 KB |
| v13-0005-Refactor-pg_waldump-Move-WAL-segment-size-to-XLo.patch | application/x-patch | 5.1 KB |
| v13-0006-pg_waldump-Add-support-for-archived-WAL-decoding.patch | application/x-patch | 41.7 KB |
| v13-0007-pg_waldump-Remove-the-restriction-on-the-order-o.patch | application/x-patch | 13.1 KB |
| v13-0008-pg_verifybackup-Delay-default-WAL-directory-prep.patch | application/x-patch | 1.7 KB |
| v13-0009-pg_verifybackup-Rename-the-wal-directory-switch-.patch | application/x-patch | 5.9 KB |
| v13-0010-pg_verifybackup-enabled-WAL-parsing-for-tar-form.patch | application/x-patch | 9.9 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | shveta malik | 2026-02-18 07:06:31 | Re: [PATCH] Support automatic sequence replication |
| Previous Message | shveta malik | 2026-02-18 06:57:55 | Re: Skipping schema changes in publication |