From: | Amul Sul <sulamul(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: pg_waldump: support decoding of WAL inside tarfile |
Date: | 2025-09-29 16:17:10 |
Message-ID: | CAAJ_b94gK1np8d1h-2c1YoCccGXr4zspTa-FC7X_bfXZNz=-DA@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Sep 29, 2025 at 8:45 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, Sep 25, 2025 at 4:25 AM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> > > Another thing that isn't so nice right now is that
> > > verify_tar_archive() has to open and close the archive only for
> > > init_tar_archive_reader() to be called to reopen it again just moments
> > > later. It would be nicer to open the file just once and then keep it
> > > open. Here again, I wonder if the separation of duties could be a bit
> > > cleaner.
> >
> > Prefer to keep those separate, assuming that reopening the file won't
> > cause any significant harm. Let me know if you think otherwise.
>
> Well, I guess I'd like to know why we can't do better. I'm not really
> worried about performance, but reopening the file means that you can
> never make it work with reading from a pipe.
I am skeptical about the extra code this might introduce, given that
performance is not my primary concern here. If we aim to open the file
only once, that logic should be implemented before calling
verify_tar_archive(), not inside it. Implementing the open and close
logic within verify_tar_archive() and free_tar_archive_reader() would
create a confusing and scattered pattern, especially since each of these
separate operations requires only two lines of code (open and close if
it's a tar file).

My second concern is that after verify_tar_archive() we might need to
reset the file read offset to the beginning. While reusing the buffered
data from the first pass is technically possible, that only works if the
desired start LSN is at the very beginning of the archive or later in
the sequence, which cannot be reliably guaranteed. Therefore, for
simplicity and to avoid the complexity of managing that offset reset
code, I am thinking of a simpler approach.
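
To make the offset-reset concern concrete, here is a minimal sketch. It
is not taken from the patch: rewind_archive_fd() is a hypothetical
helper, and it assumes the archive is read through a plain POSIX file
descriptor. It shows that rewinding after a verification pass works for
a regular file but fails with ESPIPE on a pipe, so open-once by itself
would not be enough for pipe input without also keeping the data
already read.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of pg_waldump): after a verification
 * pass has consumed some bytes, try to move the read offset of an
 * already-open archive descriptor back to the start.
 *
 * Returns true if the descriptor was rewound.  Returns false if lseek()
 * fails; for pipes, FIFOs and sockets errno is set to ESPIPE, which
 * means the bytes already read would have to be kept in a buffer.
 */
bool
rewind_archive_fd(int fd)
{
    assert(fd >= 0);

    if (lseek(fd, 0, SEEK_SET) == (off_t) -1)
        return false;

    return true;
}
```

A caller that wants to support both cases would have to check the return
value and fall back to replaying buffered data when the rewind fails,
which is the extra bookkeeping discussed above.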
Regards,
Amul