Re: Adding REPACK [concurrently]

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net>
Subject: Re: Adding REPACK [concurrently]
Date: 2026-04-06 11:56:33
Message-ID: CAA4eK1+6GyTWSQ4XC8B=TJCkk=uFG1N+d7xg+Hp1N=uphVX0=Q@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 6, 2026 at 3:08 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> On 2026-Apr-06, vignesh C wrote:
>
> > On Mon, 6 Apr 2026 at 02:12, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> > Few comments:
>
> Thanks for reviewing this patch!
>
> > 1) Can we add a comment why we should error out here, as repack
> > concurrently requires this whereas repack does not require this check.
> > Even if it is required for decoding can't it be handled by replica
> > identity full:
> > + /*
> > + * If the identity index is not set due to replica identity being, PK
> > + * might exist.
> > + */
> > + ident_idx = RelationGetReplicaIndex(rel);
> > + if (!OidIsValid(ident_idx) && OidIsValid(rel->rd_pkindex))
> > + ident_idx = rel->rd_pkindex;
> > + if (!OidIsValid(ident_idx))
> > + ereport(ERROR,
> > +
> > errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> > + errmsg("cannot process relation \"%s\"",
> > + RelationGetRelationName(rel)),
> > + errhint("Relation \"%s\" has no
> > identity index.",
> > + RelationGetRelationName(rel)));
>
> Ah, I was just rewriting that comment moments ago. I think we could
> make it work with replica identity full in theory, but I'm guessing it
> would be useless. You really need an index that lets you locate the
> affected tuples; otherwise the replay would take forever. If the table
> is big, that would make the whole thing unworkable. If the table isn't
> big, then you can probably just use straight repack without too much
> disruption.
>
> /*
> * Obtain the replica identity index -- either one that has been set
> * explicitly, or the primary key. If none of these cases apply, the
> * table cannot be repacked concurrently. It might be possible to have
> * repack work with a FULL replica identity; however that requires more
> * work and is not implemented yet.
> */
>
> Now, it's possible to have a non-unique index on some columns (not good
> enough to be replica identity) which gives you a list of candidate
> tuples, and then the implementation chooses one based on the other
> columns which are present in the FULL replica identity. (This seems a
> bit dangerous, because there might be multiple matching tuples; and how
> do you choose?) Now let's further suppose you can narrow down to a
> single tuple; in that case, using FULL would work.
>

I think this is already working for apply worker where we compare
tuples if the index is non_uniuqe, see FindReplTupleInLocalRel.

>
However, as I said,
> this requires more implementation effort. We could entertain a patch
> for this during the pg20 cycle, though I'm doubtful that it would really
> be worth your while.
>

It is fine to support it later in pg20. In general, I wonder if we
have evaluated whether some of the apply-worker infrastructure could
have been reused here?

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andreas Karlsson 2026-04-06 11:59:01 Re: DEREF_AFTER_NULL: src/common/jsonapi.c:2529
Previous Message Andrew Dunstan 2026-04-06 11:55:02 Re: pg_get__*_ddl consolidation