| From: | Amul Sul <sulamul(at)gmail(dot)com> |
|---|---|
| To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
| Cc: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: pg_waldump: support decoding of WAL inside tarfile |
| Date: | 2026-03-02 13:00:24 |
| Message-ID: | CAAJ_b96csGv+TvdxK-zp1Zo6zrAxOJ4n-qnxiWO1f0Lk0X0N_g@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Wed, Feb 18, 2026 at 12:28 PM Amul Sul <sulamul(at)gmail(dot)com> wrote:
>
> On Tue, Feb 10, 2026 at 3:06 PM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> >
> > On Wed, Feb 4, 2026 at 6:39 PM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Jan 28, 2026 at 2:41 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > > >
> > > > On Tue, Jan 27, 2026 at 7:07 AM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> > > > > In the attached version, I am using the WAL segment name as the hash
> > > > > key, which is much more straightforward. I have rewritten
> > > > > read_archive_wal_page(), and it looks much cleaner than before. The
> > > > > logic to discard irrelevant WAL files is still within
> > > > > get_archive_wal_entry. I added an explanation for setting cur_wal to
> > > > > NULL, which is now handled in the separate function I mentioned
> > > > > previously.
> > > > >
> > > > > Kindly have a look at the attached version; let me know if you are
> > > > > still not happy with the current approach for filtering/discarding
> > > > > irrelevant WAL segments. It isn't much different from the previous
> > > > > version, but I have tried to keep it in a separate routine for better
> > > > > code readability, with comments to make it easier to understand. I
> > > > > also added a comment for ArchivedWALFile.
> > > >
> > > > I feel like the division of labor between get_archive_wal_entry() and
> > > > read_archive_wal_page() is odd. I noticed this in the last version,
> > > > too, and it still seems to be the case. get_archive_wal_entry() first
> > > > calls ArchivedWAL_lookup(). If that finds an entry, it just returns.
> > > > If it doesn't, it loops until an entry for the requested file shows up
> > > > and then returns it. Then control returns to read_archive_wal_page()
> > > > which loops some more until we have all the data we need for the
> > > > requested file. But it seems odd to me to have two separate loops
> > > > here. I think that the first loop is going to call read_archive_file()
> > > > until we find the beginning of the file that we care about and then
> > > > the second one is going to call read_archive_file() some more until we
> > > > have read enough of it to satisfy the request. It feels odd to me to
> > > > do it that way, as if we told somebody to first wait until 9 o'clock
> > > > and then wait another 30 minutes, instead of just telling them to wait
> > > > until 9:30. I realize it's not quite the same thing, because apart
> > > > from calling read_archive_file(), the two loops do different things,
> > > > but I still think it looks odd.
> > > >
> > > > + /*
> > > > + * Ignore if the timeline is different or the current segment is not
> > > > + * the desired one.
> > > > + */
> > > > + XLogFromFileName(entry->fname, &curSegTimeline, &curSegNo, WalSegSz);
> > > > + if (privateInfo->timeline != curSegTimeline ||
> > > > + privateInfo->startSegNo > curSegNo ||
> > > > + privateInfo->endSegNo < curSegNo ||
> > > > + segno > curSegNo)
> > > > + {
> > > > + free_archive_wal_entry(entry->fname, privateInfo);
> > > > + continue;
> > > > + }
> > > >
> > > > The comment doesn't match the code. If it did, the test would be
> > > > (privateInfo->timeline != curSegTimeline || segno != curSegno). But
> > > > instead the segno test is > rather than !=, and the checks against
> > > > startSegNo and endSegNo aren't explained at all. I think I understand
> > > > why the segno test uses > rather than !=, but it's the point of the
> > > > comment to explain things like that, rather than leaving the reader to
> > > > guess. And I don't know why we also need to test startSegNo and
> > > > endSegNo.
> > > >
> > > > I also wonder what the point is of doing XLogFromFileName() on the
> > > > fname provided by the caller and then again on entry->fname. Couldn't
> > > > you just compare the strings?
> > > >
> > > > Again, the division of labor is really odd here. It's the job of
> > > > astreamer_waldump_content() to skip things that aren't WAL files at
> > > > all, but it's the job of get_archive_wal_entry() to skip things that
> > > > are WAL files but not the one we want. I disagree with putting those
> > > > checks in completely separate parts of the code.
> > > >
> > >
> > > Keeping the timeline and segment start-end range checks inside the
> > > archive streamer creates a circular dependency that cannot be resolved
> > > without a 'dirty hack'. We must read the first available WAL file page
> > > to determine the wal_segment_size before it can calculate the target
> > > segment range. Moving the checks inside the streamer would make it
> > > impossible to process that initial file, as the necessary filtering
> > > parameters -- would still be unknown which would need to be skipped
> > > for the first read somehow. What if later we realized that the first
> > > WAL file which was allowed to be streamed by skipping that check is
> > > irrelevant and doesn't fall under the start-end segment range?
> > >
> >
> > Please have a look at the attached version, specifically patch 0005.
> > In astreamer_waldump_content(), I have moved the WAL file filtration
> > check from get_archive_wal_entry(). This check will be skipped during
> > the initial read in init_archive_reader(), which instead performs it
> > explicitly once it determines the WAL segment size and the start/end
> > segments.
> >
> > To access the WAL segment size inside astreamer_waldump_content(), I
> > have moved the WAL segment size variable into the XLogDumpPrivate
> > structure in the separate 0004 patch.
>
> Attached is an updated version including the aforesaid changes. It
> includes a new refactoring patch (0001) that moves the logic for
> identifying tar archives and their compression types from
> pg_basebackup and pg_verifybackup into a separate-reusable function,
> per a suggestion from Euler [1]. Additionally, I have added a test
> for the contrecord decoding to the main patch (now 0006).
>
> 1] http://postgr.es/m/3c8e7b02-2152-495a-a0b6-e37cf9286a70@app.fastmail.com
>
Rebased against the latest master, fixed typos in code comments, and
replaced palloc0 with palloc0_object.
Regards,
Amul
| Attachment | Content-Type | Size |
|---|---|---|
| v14-0001-Refactor-Move-tar-archive-parsing-into-a-common-.patch | application/octet-stream | 6.7 KB |
| v14-0002-Refactor-pg_waldump-Move-some-declarations-to-ne.patch | application/octet-stream | 2.2 KB |
| v14-0003-Refactor-pg_waldump-Separate-logic-used-to-calcu.patch | application/octet-stream | 2.4 KB |
| v14-0004-Refactor-pg_waldump-Restructure-TAP-tests.patch | application/octet-stream | 6.6 KB |
| v14-0005-Refactor-pg_waldump-Move-WAL-segment-size-to-XLo.patch | application/octet-stream | 5.1 KB |
| v14-0006-pg_waldump-Add-support-for-archived-WAL-decoding.patch | application/octet-stream | 41.7 KB |
| v14-0007-pg_waldump-Remove-the-restriction-on-the-order-o.patch | application/octet-stream | 13.1 KB |
| v14-0008-pg_verifybackup-Delay-default-WAL-directory-prep.patch | application/octet-stream | 1.7 KB |
| v14-0009-pg_verifybackup-Rename-the-wal-directory-switch-.patch | application/octet-stream | 5.9 KB |
| v14-0010-pg_verifybackup-Enabled-WAL-parsing-for-tar-form.patch | application/octet-stream | 9.9 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | VASUKI M | 2026-03-02 13:02:55 | Re: [PATCH] psql: tab completion for ALTER ROLE ... IN DATABASE ... |
| Previous Message | David Geier | 2026-03-02 12:50:37 | Re: Convert NOT IN sublinks to anti-joins when safe |