Re: Initial Schema Sync for Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "Kumar, Sachin" <ssetiya(at)amazon(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Initial Schema Sync for Logical Replication
Date: 2023-03-29 11:07:43
Message-ID: CAA4eK1LtMpX=i18Ph4mnMGyK5pR=M+VQFmkz6aUP49knGaRJhw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 28, 2023 at 8:30 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Mar 28, 2023 at 6:47 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > > >
> > > > > > I think we can have same issues as you mentioned New table t1 is added
> > > > > > to the publication , User does a refresh publication.
> > > > > > pg_dump / pg_restore restores the table definition. But before
> > > > > > tableSync can start, steps from 2 to 5 happen on the publisher.
> > > > > > > 1. Create Table t1(c1, c2); --LSN: 90 2. Insert t1 (1, 1); --LSN 100
> > > > > > > 3. Insert t1 (2, 2); --LSN 110 4. Alter t1 Add Column c3; --LSN 120
> > > > > > > 5. Insert t1 (3, 3, 3); --LSN 130
> > > > > > And table sync errors out
> > > > > > There can be one more issue , since we took the pg_dump without
> > > > > snapshot (wrt to replication slot).
> > > > > >
> > > > >
> > > > > To avoid both the problems mentioned for Refresh Publication, we can do
> > > > > one of the following: (a) create a new slot along with a snapshot for this
> > > > > operation and drop it afterward; or (b) using the existing slot, establish a
> > > > > new snapshot using a technique proposed in email [1].
> > > > >
> > > >
> > > > Thanks, I think option (b) will be perfect, since we don’t have to create a new slot.
> > >
> > > Regarding (b), does it mean that apply worker stops streaming,
> > > requests to create a snapshot, and then resumes the streaming?
> > >
> >
> > Shouldn't this be done by the backend performing a REFRESH publication?
>
> Hmm, I might be missing something but the idea (b) uses the existing
> slot to establish a new snapshot, right? What existing replication
> slot do we use for that? I thought it was the one used by the apply
> worker.
>

Right, it will be the same as the one for apply worker. I think if we
decide to do initial sync via apply worker then in this case also, we
need to let apply worker restart and perform initial sync as the first
thing.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Matthias van de Meent 2023-03-29 11:51:36 BufmgrCommit no-op since 2008, remaining uses?
Previous Message Kumar, Sachin 2023-03-29 10:57:49 RE: Initial Schema Sync for Logical Replication