Re: Initial Schema Sync for Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, "Kumar, Sachin" <ssetiya(at)amazon(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Initial Schema Sync for Logical Replication
Date: 2023-07-09 03:59:04
Message-ID: CAA4eK1J=k1s=k7py=1Hme5avjWUNQ68gr-TDuZCBd-mgLNHdWQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 5, 2023 at 7:45 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, Jun 19, 2023 at 5:29 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > Below are my review comments for the PoC patch 0001.
> >
> > In addition, the patch needed rebasing, and, after I rebased it
> > locally in my private environment there were still test failures:
> > a) The 'make check' tests fail but only in a minor way due to changes colname
> > b) the subscription TAP test did not work at all for me -- many errors.
>
> Thank you for reviewing the patch.
>
> While updating the patch, I realized that the current approach won't
> work well or at least has the problem with partition tables. If a
> publication has a partitioned table with publish_via_root = false, the
> subscriber launches tablesync workers for its partitions so that each
> tablesync worker copies data of each partition. Similarly, if it has a
> partition table with publish_via_root = true, the subscriber launches
> a tablesync worker for the parent table. With the current design,
> since the tablesync worker is responsible for both schema and data
> synchronization for the target table, it won't be possible to
> synchronize both the parent table's schema and partitions' schema.
>

I think one possibility to make this design work is that when
publish_via_root is false, then we assume that subscriber already has
parent table and then the individual tablesync workers can sync the
schema of partitions and their data. And when publish_via_root is
true, then the table sync worker is responsible to sync parent and
child tables along with data. Do you think such a mechanism can
address the partition table related cases?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2023-07-09 04:17:03 Re: Check lateral references within PHVs for memoize cache keys
Previous Message Zhang Mingli 2023-07-09 03:51:44 Re: [feature]COPY FROM enable FORCE_NULL/FORCE_NOT_NULL on all columns