Re: Initial Schema Sync for Logical Replication

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Kumar, Sachin" <ssetiya(at)amazon(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Initial Schema Sync for Logical Replication
Date: 2023-04-17 03:41:29
Message-ID: CAD21AoBtBDm84qf=MHShkgRawxmf2n-ptjgLkj3r8hcXuu=85A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 7, 2023 at 6:37 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Apr 6, 2023 at 6:57 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Thu, Mar 30, 2023 at 10:11 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Mar 30, 2023 at 12:18 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > > >
> > > > How can we postpone creating the pg_subscription_rel entries until the
> > > > tablesync worker starts and does the schema sync? I think that since
> > > > pg_subscription_rel entry needs the table OID, we need either to do
> > > > the schema sync before creating the entry (i.e, during CREATE
> > > > SUBSCRIPTION) or to postpone creating entries as Amit proposed[1]. The
> > > > apply worker needs the information of tables to sync in order to
> > > > launch the tablesync workers, but it needs to create the table schema
> > > > to get that information.
> > >
> > > For the above reason, I think that step 6 of the initial proposal won't work.
> > >
> > > If we can have the tablesync worker create an entry of
> > > pg_subscription_rel after creating the table, it may give us the
> > > flexibility to perform the initial sync. One idea is that we add a
> > > relname field to pg_subscription_rel so that we can create entries
> > > with relname instead of OID if the table is not created yet. Once the
> > > table is created, we clear the relname field and set the OID of the
> > > table instead. It's not an ideal solution but we might make it simpler
> > > later.
> >
> > While writing a PoC patch, I found some difficulties in this idea.
> > First, I tried to add schemaname+relname to pg_subscription_rel but I
> > could not define the primary key of pg_subscription_rel. The primary
> > key on (srsubid, srrelid) doesn't work since srrelid could be NULL.
> > Similarly, the primary key on (srsubid, srrelid, schemaname, relname)
> > also doesn't work.
> >
>
> Can we think of having a separate catalog table say
> pg_subscription_remote_rel for this? You can have srsubid,
> remote_schema_name, remote_rel_name, etc. We may need some other state
> to be maintained during the initial schema sync where this table can
> be used. Basically, this can be used to maintain the state till the
> initial schema sync is complete because we can create a relation entry
> in pg_subscritption_rel only after the initial schema sync is
> complete.

It might not be ideal but I guess it works. But I think we need to
modify the name of replication slot for initial sync as it currently
includes OID of the table:

void
ReplicationSlotNameForTablesync(Oid suboid, Oid relid,
char *syncslotname, Size szslot)
{
snprintf(syncslotname, szslot, "pg_%u_sync_%u_" UINT64_FORMAT, suboid,
relid, GetSystemIdentifier());
}

If we use both schema name and table name, it's possible that slot
names are duplicated if schema and/or table names are long. Another
idea is to use the hash value of schema+table names, but it cannot
completely eliminate that possibility, and probably would make
investigation and debugging hard in case of any failure. Probably we
can use the OID of each entry in pg_subscription_remote_rel instead,
but I'm not sure it's a good idea, mainly because we will end up using
twice as many OIDs as before.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nishant Sharma 2023-04-17 05:30:05 Re: postgres_fdw: wrong results with self join + enable_nestloop off
Previous Message Richard Guo 2023-04-17 02:57:01 Re: Allowing parallel-safe initplans