Re: [PATCH] Preserve replication origin OIDs in pg_upgrade

From: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
To: Ajin Cherian <itsajin(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shvetamalik(at)gmail(dot)com>
Subject: Re: [PATCH] Preserve replication origin OIDs in pg_upgrade
Date: 2026-06-19 07:48:34
Message-ID: CANhcyEUhTTzFktkyxpsLeq1dQWVRNNN-=ZS6HC_Jnb1srj4-BQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 17 Jun 2026 at 14:49, Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
>
> On Fri, Jun 5, 2026 at 8:10 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > Few comments:
> > 1) pg_dumpall dumps all replication origins, not just subscription
> > origins and recreates every origin:
> > SELECT o.*, os.remote_lsn
> > FROM pg_replication_origin o
> > LEFT JOIN pg_replication_origin_status os
> > ON o.roident = os.local_id
> >
> > Is pg_upgrade expected to preserve user-created origins?
> > If so, you can mention this in the commit message
> >
>
> I explicitly state that all replication origins are migrated in the
> commit message, nothing specific about subscription origins.
>
> > 2) Should the table lock be released here, if so should we mention
> > comments similar to how it is mentioned in
> > binary_upgarde_add_sub_rel_state:
> > + /* Restore the remote_lsn if provided, while still holding the lock */
> > + if (!PG_ARGISNULL(2))
> > + {
> > + XLogRecPtr remote_commit = PG_GETARG_LSN(2);
> >
> > - ReplicationSlotRelease();
> > + replorigin_advance(node, remote_commit, InvalidXLogRecPtr,
> > + false /* backward */,
> > + false /* WAL log */);
> > + }
> > +
> > + table_close(rel, RowExclusiveLock);
> >
>
> Is that relevant when creating a new row? or only when updating an existing one?
>
> > 3) We can use new style of ereport to exclude '(' before errcode:
> > 3.a)
> > + ereport(ERROR,
> > + (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
> > + errmsg("replication origin name is too long"),
> > + errdetail("Replication origin names must be no longer than %d bytes.",
> > + MAX_RONAME_LEN)));
> >
> > 3.b) Similarly here:
> > + ereport(ERROR,
> > + (errcode(ERRCODE_DUPLICATE_OBJECT),
> > + errmsg("replication origin with ID %u already exists", node_oid)));
> >
> > 3.c) There are many others, you can search other ereport in the patch
> > and change it too.
> >
>
> Done.
>
> > 4) Since only one column will be returned here, can we use
> > "PQgetvalue(res, 0, 0) and remove PQfnumber:
> > + res = executeQueryOrDie(conn,
> > + "SELECT count(*) AS nrepl_origins "
> > + "FROM pg_catalog.pg_replication_origin");
> > + i_nrepl_origins = PQfnumber(res, "nrepl_origins");
> > + cluster->nrepl_origins = atoi(PQgetvalue(res, 0, i_nrepl_origins));
> > + PQclear(res);
> >
>
> Done.
>
> > 5) replorigin_create also has some duplicate code like below, will it
> > be possible to have a common function so that both of them can use?
> > /* Check for name collision */
> > + ScanKeyInit(&key,
> > + Anum_pg_replication_origin_roname,
> > + BTEqualStrategyNumber, F_TEXTEQ,
> > + roname_d);
> > + scan = systable_beginscan(rel, ReplicationOriginNameIndex,
> > + true /* indexOK */,
> > + SnapshotSelf,
> > + 1, &key);
> > + collides = HeapTupleIsValid(systable_getnext(scan));
> > + systable_endscan(scan);
> > +
> > + if (collides)
> > + ereport(ERROR,
> > + (errcode(ERRCODE_DUPLICATE_OBJECT),
> > + errmsg("replication origin \"%s\" already exists",
> > + originname)));
> > +
> > + memset(&nulls, 0, sizeof(nulls));
> > + memset(&values, 0, sizeof(values));
> > +
> > + values[Anum_pg_replication_origin_roident - 1] = ObjectIdGetDatum(node);
> > + values[Anum_pg_replication_origin_roname - 1] = roname_d;
> > +
> > + tuple = heap_form_tuple(RelationGetDescr(rel), values, nulls);
> > + CatalogTupleInsert(rel, tuple);
> > + heap_freetuple(tuple);
> > + CommandCounterIncrement();
> >
>
> I have created a new function replorigin_create_with_id() which has
> some common code and called that from both replorigin_create() and
> binary_upgrade_create_replication_origin()
>
>
> On Tue, Jun 9, 2026 at 3:06 AM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
> >
> > Hi Ajin,
> > I have reviewed the patch and here are some comments:
> >
> > 1. I want to clarify, what is the expected behaviour when we upgrade
> > the postgres instance from version <= 16 to "HEAD + patch"?
> > - ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname,
> > sizeof(originname));
> > - replorigin_create(originname);
> > + if (!IsBinaryUpgrade)
> > + {
> > + ReplicationOriginNameForLogicalRep(subid, InvalidOid,
> > originname, sizeof(originname));
> > + replorigin_create(originname);
> > + }
> > and
> > + /* Dump replication origins */
> > + if (server_version >= 170000 && binary_upgrade &&
> > archDumpFormat == archNull)
> > + dumpReplicationOrigins(conn);
> >
> > When the postgres instance with subscription is updated from PG 16 (or
> > less) to HEAD, the replication origin is created as we do not have "if
> > (!IsBinaryUpgrade)" check in HEAD,
> > whereas when the similar instance is upgraded to "HEAD + patch", no
> > replication origin is present after the upgrade.
> > Is this difference in the behaviour between HEAD and "HEAD + patch" expected?
> >
>
> I have modified this as option 3 suggested by Shveta and Amit and
> removed the PG17 restriction for migrating replication origins, now it
> does it for all versions that have replication origins.
>
> > 2.a) Can we replace:
> > + /* Check for OID collision */
> > + ScanKeyInit(&key,
> > + Anum_pg_replication_origin_roident,
> > + BTEqualStrategyNumber, F_OIDEQ,
> > + ObjectIdGetDatum(node));
> > + scan = systable_beginscan(rel, ReplicationOriginIdentIndex,
> > + true /* indexOK */,
> > + SnapshotSelf,
> > + 1, &key);
> > + collides = HeapTupleIsValid(systable_getnext(scan));
> > + systable_endscan(scan);
> > +
> > + if (collides)
> > + ereport(ERROR,
> > + (errcode(ERRCODE_DUPLICATE_OBJECT),
> > + errmsg("replication origin with ID %u already
> > exists", node_oid)));
> >
> > with:
> > if (SearchSysCacheExists1(REPLORIGIDENT, ObjectIdGetDatum(node)))
> > ereport(ERROR, errcode(ERRCODE_DUPLICATE_OBJECT),
> > errmsg("replication origin with ID %u already
> > exists", node_oid));
> >
> > 2.b) Similarly, Can we replace:
> > + /* Check for name collision */
> > + ScanKeyInit(&key,
> > + Anum_pg_replication_origin_roname,
> > + BTEqualStrategyNumber, F_TEXTEQ,
> > + roname_d);
> > + scan = systable_beginscan(rel, ReplicationOriginNameIndex,
> > + true /* indexOK */,
> > + SnapshotSelf,
> > + 1, &key);
> > + collides = HeapTupleIsValid(systable_getnext(scan));
> > + systable_endscan(scan);
> > +
> > + if (collides)
> > + ereport(ERROR,
> > + (errcode(ERRCODE_DUPLICATE_OBJECT),
> > + errmsg("replication origin \"%s\" already exists",
> > + originname)));
> > with:
> > if (SearchSysCacheExists1(REPLORIGNAME, roname_d))
> > ereport(ERROR, errcode(ERRCODE_DUPLICATE_OBJECT),
> > errmsg("replication origin \"%s\" already
> > exists", originname));
> >
> > This will make the code smaller and easy to read. Thoughts?
>
> Done
>
>
> On Tue, Jun 9, 2026 at 2:13 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> >
> > Please find a few comments on v007:
> >
> > 1)
> >
> > + node = (ReplOriginId) node_oid;
> > + originname = text_to_cstring(PG_GETARG_TEXT_PP(1));
> > +
> > + if (strlen(originname) > MAX_RONAME_LEN)
> > + ereport(ERROR,
> > + (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
> > + errmsg("replication origin name is too long"),
> > + errdetail("Replication origin names must be no longer than %d bytes.",
> > + MAX_RONAME_LEN)));
> > +
> > + roname_d = CStringGetTextDatum(originname);
> >
> > CStringGetTextDatum is:
> > #define CStringGetTextDatum(s) PointerGetDatum(cstring_to_text(s))
> > ---------
> >
> > We are first converting text to C-string , then Cstring to text and
> > then getting Datum. Double allocation and double conversion. Can we
> > please try this instead and verify if it works:
> >
> > text *origin_text = PG_GETARG_TEXT_PP(1);
> > originname = text_to_cstring(origin_text);
> > roname_d = PointerGetDatum(origin_text);
> >
>
> I've restructured the code now, its a bit tough to do this now.
>
> > 2)
> > +check_new_cluster_replication_origins(void)
> > +{
> > + PGconn *conn;
> > + PGresult *res;
> > + int norigins;
> > +
> > + /* Quick return if there are no replication origins to migrate. */
> > + if (old_cluster.nrepl_origins == 0)
> > + return;
> >
> > Don't we need a version check simialr to what we have introduced in
> > pg_dumpall for origins? (Similar to
> > check_new_cluster_replication_slots as well)
> >
>
> I've removed the version check for PG17 now.
>
> > 3)
> > Since we have introduced check_new_cluster_replication_origins, do we
> > even need below checks in binary_upgrade_create_replication_origin()?
> > In which conditions will they be hit?
> >
> > + /* Check for OID collision */
> > + ....
> > + if (collides)
> > + ereport(ERROR,
> > + (errcode(ERRCODE_DUPLICATE_OBJECT),
> > + errmsg("replication origin with ID %u already exists", node_oid)));
> > +
> > + /* Check for name collision */
> > + ....
> > + if (collides)
> > + ereport(ERROR,
> > + (errcode(ERRCODE_DUPLICATE_OBJECT),
> > + errmsg("replication origin \"%s\" already exists",
> > + originname)));
> >
>
> With the restructured code, its not easy to do this now.
>
> >
> > 4)
> > Now since check_new_cluster_subscription_configuration() is purely
> > about checking max-origin GUC configuration, I think the name is
> > misleading. IMO, we should merge it to
> > check_new_cluster_replication_origins(). See how
> > check_new_cluster_replication_slots() does it: checking both
> > new-cluster's count and the max-configuration for slots.
> >
>
> Done.
>
>
> On Tue, Jun 9, 2026 at 9:32 PM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
> >
> > >
> > One more comment:
> >
> > 3. In logical-replication.sgml, we have documentation as:
> > "Commit timestamps and origin data are not preserved during the upgrade."
> > Now since the origin related data is preserved after the upgrade,
> > should we update the documentation here?
> >
>
> Updated.
>
> I have addressed the comments in v8 (attached). I've also now taken
> out the check for PG17 on old cluster and now all clusters which have
> replication origins are migrated.

Hi Ajin,

Thanks for providing the updated patch, I reviewed the patch and I
have some comments:

1. patch is not applying on HEAD and needs a rebase.

2. New line after 'check_new_cluster_replication_origins' is not required:
+ check_new_cluster_replication_origins();
+

3. pg_dumpall with option '--roles-only' also dumps replication origins
eg:
./pg_dumpall -p 9000 --binary-upgrade --roles-only > op.sql
output contains:
-- For binary upgrade, must preserve replication origin roident and remote_lsn
SELECT pg_catalog.binary_upgrade_create_replication_origin('1'::pg_catalog.oid,
'pg_16387'::pg_catalog.text, '0/00000000'::pg_catalog.pg_lsn);

I think in this case we should not dump the replication origins as we
explicitly gave the option of '--roles-only'

4. pg_dumpall with --globals-only option also dumps replication origins.
eg:
./pg_dumpall -p 9000 --binary-upgrade --globals-only> op.sql and
output contains:
-- For binary upgrade, must preserve replication origin roident and remote_lsn
SELECT pg_catalog.binary_upgrade_create_replication_origin('1'::pg_catalog.oid,
'pg_16387'::pg_catalog.text, '0/00000000'::pg_catalog.pg_lsn);

Is this behaviour intentional? Other logical replication objects
(e.g., publications and subscriptions) are not included in pg_dumpall
--globals-only, so is it okay to dump replication origin?

Thanks,
Shlok Kyal

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Matthias van de Meent 2026-06-19 08:26:31 Re: Multi-Entry Indexing for GiST & SP-GiST
Previous Message Matthias van de Meent 2026-06-19 07:22:28 Re: Unexpected behavior after OOM errors