Re: Adding REPACK [concurrently]

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Antonin Houska <ah(at)cybertec(dot)at>, Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net>
Subject: Re: Adding REPACK [concurrently]
Date: 2026-05-13 16:58:14
Message-ID: agSqVs6DgQZxYUmW@alvherre.pgsql
Lists: pgsql-hackers

Hello Amit,

On 2026-May-13, Amit Kapila wrote:

> So now the question is where do we go from here. I am not confident
> that the current code to achieve db-specific snapshots in logical
> decoding is the best possible solution both because of the drawbacks
> (like we won't be able to enable this on standby) and inefficiencies
> pointed out by me in this and previous emails in this work.

This is a fair question. I don't think we have time to go much further
on this aspect before beta 1, so we have two options: either we accept
this patch, fix the inefficiencies you pointed out, and keep db-specific
snapshots; or we revert db-specific snapshots, go back to the standard
snapshot-taking technique for REPACK in 19, and see what we can improve
for 20.

Now, the worst consequence of reverting db-specific snapshots is that
you will only be able to run REPACK in a single database at a time
(because any subsequent REPACK will have to wait until the first one
finishes before being able to get its snapshot). In most normal cases
this is probably not a big deal. But if you have a multitenant system,
and you want your users to be able to run REPACK on their tables, you
may be a bit screwed. So I hesitate to just go and revert it without
offering those people any alternative.

(It's also possible that being unable to run more than one REPACK at a
time is not so big a deal. After all, it's supposed to be an infrequent
operation. And users probably don't or shouldn't have multi-terabyte
tables in multitenant databases anyway.)

I'm not sure I understand the point about the standby. I mean, you can't
run REPACK on the standby anyway, so I don't see this as a very
problematic restriction. Do you have other reasons for wanting a
db-specific snapshot in a standby?

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
