RE: Initial Schema Sync for Logical Replication

From: "Kumar, Sachin" <ssetiya(at)amazon(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Initial Schema Sync for Logical Replication
Date: 2023-04-05 13:25:12
Message-ID: 961467ad0fd94bd0bc2036364d1e40cb@amazon.com
Lists: pgsql-hackers

> From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
> > >
> > > 3. The apply worker launches the tablesync workers for tables that
> > > need to be synchronized.
> > >
> > > There might be DDLs executed on the publisher for tables before the
> > > tablesync worker starts. But the apply worker needs to apply DDLs
> > > for pre-data database objects. OTOH, it can ignore DDLs for
> > > not-synced-yet tables and other database objects such as INDEX,
> > > TRIGGER, RULE, etc. (i.e. post-data).
> > >
> > > 4. The tablesync worker creates its replication slot, dumps+restores
> > > the table schema, updates the pg_subscription_rel, and performs COPY.
> > >
> > > These operations should be done in the same transaction.
> >
> > pg_restore won't be rollbackable, so we need to maintain states in
> > pg_subscription_rel.
>
> Yes. But I think it depends on how we restore them. For example, if we have
> the tablesync worker somehow restore the table using a new SQL function
> returning the table schema as we discussed, or executing the dump file via
> SPI, we can do that in the same transaction.

Okay.
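
For example, a rough sketch of the tableSync worker's single transaction
(the CREATE TABLE text here is only a stand-in for whatever the new SQL
function or the dump file would give us):

BEGIN;
-- restored schema, as fetched from the publisher
CREATE TABLE public.t1 (id int, val text);
-- pg_subscription_rel would be updated here (internally, not via plain SQL)
COPY public.t1 FROM STDIN;
COMMIT;  -- schema restore, catalog update and COPY commit atomically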

>
> >
> > >
> > > 5. After finishing COPY, the tablesync worker dumps indexes (and
> > > perhaps
> > > constraints) of the table and creates them (which possibly takes a long
> > > time).
> > > Then it starts to catch up, same as today. The apply worker needs to
> > > wait for the tablesync worker to catch up.
> >
> > I don't think we can have a CATCHUP stage. We can have a DDL on the
> > publisher which adds a new column (and this DDL will be executed by the
> > applier later). Then when we get an INSERT, since we have the old
> > definition of the table, the insert will fail.
>
> All DMLs and DDLs associated with the table being synchronized are applied
> by the tablesync worker until it catches up with the apply worker.

Right, sorry, I forgot that in the above case, if the definition on the publisher changes, we will
also have corresponding DDLs.
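
For reference, the case I had in mind was:

-- on the publisher, after tableSync copied the old definition of t1
ALTER TABLE t1 ADD COLUMN extra int;
INSERT INTO t1 (id, extra) VALUES (1, 42);
-- replaying this INSERT against the old definition of t1 would fail,
-- but since tableSync applies the ALTER TABLE first, it works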

>
> >
> > >
> > > We need to repeat these steps until we complete the initial data
> > > copy and create indexes for all tables, IOW until all pg_subscription_rel
> > > status becomes READY.
> > >
> > > 6. If the apply worker confirms all tables are READY, it starts
> > > another sync worker who is responsible for the post-data database
> > > objects such as TRIGGER, RULE, POLICY etc (i.e. post-data).
> > >
> > > While the sync worker is starting up or working, the apply worker
> > > applies changes for pre-data database objects as well as READY tables.
> > We might have some issue if we have a DDL like CREATE TABLE
> > table_name AS SELECT * FROM materialized_view.
>
> Could you elaborate on the scenario where we could have an issue with such
> DDL?

Since the publisher's materialized view has not been created on the subscriber yet,
a DDL which does a CREATE TABLE using that materialized view will fail.
I am not sure how the DDL patch is handling CREATE TABLE AS statements;
if it is modified to become like a normal CREATE TABLE, then we won't have any issues.
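
For example:

CREATE MATERIALIZED VIEW mv AS SELECT 1 AS x;
CREATE TABLE t2 AS SELECT * FROM mv;
-- replayed verbatim on the subscriber before mv exists there, the
-- second statement fails with: ERROR:  relation "mv" does not exist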

>
> > >
> > > 7. Similar to the tablesync worker, this sync worker creates its
> > > replication slot and sets the returned LSN somewhere, say
> > > pg_subscription.
> > >
> > > 8. The sync worker dumps and restores these objects, which could
> > > take time since it would need to create FK constraints. Then it
> > > starts to catch up if the apply worker is ahead. The apply worker waits for
> > > the sync worker to catch up.
> > >
> > > 9. Once the sync worker catches up, the apply worker starts applying
> > > changes for all database objects.
> > >
> > > IIUC with this approach, we can resolve the concurrent DDL problem
> > > Sachin mentioned, and indexes (and constraints) are created after the
> > > initial data copy.
> > >
> > > The procedures are still very complex and not fully considered yet
> > > but I hope there are some useful things at least for discussion.
> > >
> > > Probably we can start with supporting only tables. In this case, we
> > > would not need the post-data phase (i.e. step 6-9). It seems to me
> > > that we need to have the subscription state somewhere (maybe
> > > pg_subscription) so that the apply worker can figure out the next step.
> > > Since we need to dump and restore different objects on different
> > > timings, we probably cannot directly use pg_dump/pg_restore. I've
> > > not considered how the concurrent REFRESH PUBLICATION works.
> >
> > I think the above prototype will work and will have the least amount of side
> > effects, but it might be too complex to implement and I am not sure about
> > corner cases.
> >
> > I was thinking of other ways of doing the initial sync, which are less
> > complex but each with its own set of bottlenecks:
> >
> > On Publisher Side:-
> > 1) Locking the publisher:- Easiest one to implement; the applier process
> > will get an ACCESS SHARE lock on all the published tables (we don't
> > have to worry about newly created concurrent tables). As tableSync finishes
> > syncing a table, it will release that table's lock, so we will release table
> > locks in steps. Users can still perform DML on tables, but DDLs won't be allowed.
>
> Do you mean that the apply worker acquires table locks and the tablesync
> workers release them? If so, how can we implement it?
>

I think releasing locks in steps would be impossible (given Postgres's lock implementation),
so the applier process has to open a new transaction and lock all the published tables in
ACCESS SHARE mode. After tableSync is completed, the transaction will be committed to release
the locks. So 1 and 3 are similar: we have to keep one transaction open till the tables are synced.
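
Roughly, on the publisher connection (t1, t2 standing in for the published tables):

BEGIN;
-- ACCESS SHARE conflicts only with ACCESS EXCLUSIVE, so user DML
-- keeps working but DDL needing ACCESS EXCLUSIVE is blocked
LOCK TABLE t1, t2 IN ACCESS SHARE MODE;
-- ... tableSync workers copy the tables ...
COMMIT;  -- every lock is released at once, after all tables are synced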

> >
> > 2) Using pg_dump/pg_restore for schema and data:- As Amit mentioned, we
> > can use pg_dump/pg_restore [1]. Although it might have the side effect of
> > using double storage, we can take a pg_dump of each table separately and
> > delete the dump as soon as the table is synced.
> > The tableSync process will read the dump and call pg_restore on the table.
> > If we crash in the middle of restoring the tables, we can start
> > pg_dump(--clean)/restore again with the left-out tables.
> > With this we can reduce space usage, but we might create too many files.
>
> With this idea, who does pg_dump and pg_restore? And when do we create
> pg_subscription_rel entries?

The applier process will do pg_dump/pg_restore. pg_subscription_rel entries can be created
after pg_restore; we can create a new column with the relation name and keep the OID empty, as you
have suggested earlier.
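
For the crash case, the left-out tables could then be found from the sync
state (with the proposed name column, not-yet-restored entries would carry
only the relation name), e.g.:

SELECT srrelid::regclass
FROM pg_subscription_rel
WHERE srsubstate IN ('i', 'd');  -- init / data copy not finished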

>
> >
> > 3) Using publisher snapshot:- The applier process will do
> > pg_dump/pg_restore as usual. Then the applier process will start a new
> > process P1 which will connect to the publisher and start a transaction;
> > it will call pg_export_snapshot() to export the snapshot. Then the applier
> > process will take the snapshot string and pass it to the tableSync process
> > as an argument. tableSync will use this snapshot for COPY TABLE.
> > tableSync should only do COPY TABLE and then exit, so we won't do
> > any catchup phase in tableSync. After all tables finish COPY, the transaction
> > will be committed by the P1 process and it will exit.
> > In the case of crash/restart we can simply start from the beginning, since
> > nothing is committed till every table is synced. There are 2 main
> > issues with this approach: 1. I am not sure what side-effects we might
> > have on the publisher, since we might have to keep the transaction open for
> > a long time.
>
> I'm concerned that it would not be an acceptable downside that we keep a
> transaction open until all tables are synchronized.
>

Okay. There is one more issue: just using the same snapshot will not stop table DDL
modifications; we need to have at least an ACCESS SHARE lock on each table.
So this will keep tables locked on the publisher, which makes this essentially the same as 1.
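
For reference, the snapshot hand-off I had in mind looks like this on the
publisher side (table name illustrative):

-- process P1: keep this transaction open till every table is copied
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT pg_export_snapshot();  -- returns e.g. '00000003-0000001B-1'

-- each tableSync connection: adopt P1's snapshot, COPY, then exit
BEGIN ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '00000003-0000001B-1';
COPY public.t1 TO STDOUT;
COMMIT;
-- the exported snapshot stays valid only while P1's transaction is
-- open, and by itself it does not block concurrent DDL on the tables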

Regards
Sachin
