| From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
|---|---|
| To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
| Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net> |
| Subject: | Re: Adding REPACK [concurrently] |
| Date: | 2026-04-21 07:24:25 |
| Message-ID: | 704E4371-FDA5-488F-B5E9-6B6F86A06669@gmail.com |
| Lists: | pgsql-hackers |
> On Apr 10, 2026, at 18:53, Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> Hi,
>
> When testing REPACK concurrently, I noticed that all WALs are retained from
> the moment REPACK begins copying data to the new table until the command
> finishes replaying concurrent changes on the new table and stops the repack
> decoding worker.
>
> I understand the reason: the REPACK command itself starts a long-running
> transaction, and logical decoding does not advance restart_lsn beyond the
> oldest running transaction's start position. As a result, slot.restart_lsn
> remains unchanged, preventing the checkpointer from recycling WALs.
>
> However, since REPACK can run for a long time (hours or even days), I'd like
> to confirm whether this is expected behavior or if we plan to improve it
> in the future ? And additionally, IIUC, REPACK without using concurrent option
> does not have this issue.
>
> Given that we do not restart a REPACK, I think the repack decoding worker
> should be able to advance restart_lsn each time after writing changes
> (similar to how a physical slot behaves). To illustrate this, I've written
> a patch (attached) that implements this approach, and it works fine for me.
>
> BTW, catalog_xmin also won't advance, but that seems not a big issue as
> the REPACK transaction itself also holds a snapshot that retains catalog tuples,
> so advancing catalog_xmin wouldn't change the situation anyway.
>
> Thoughts ?
>
> Best Regards,
> Hou zj
> <v1-0001-Allow-old-WALs-to-be-removed-during-REPACK-CONCUR.patch>
I found the same problem with LogicalConfirmReceivedLocation and posted a fix in a separate thread [1]. So I would withdraw my patch.
Looking at this patch, the change is exactly the same as what I did in [1], but I think the code comment should be updated as well. For the comment change, please see my patch in [1].
[1] https://www.postgresql.org/message-id/D8D9F770-DAA2-482C-A7E0-F87E5104C13E%40gmail.com
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/