Re: Single transaction in the tablesync worker?

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>
Subject: Re: Single transaction in the tablesync worker?
Date: 2020-12-04 09:02:33
Message-ID: CAA4eK1Lbe8Abc6t5t2ctrw_Gzso1e2NyDPLd1dOxj_my7j1ecA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 4, 2020 at 10:29 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Dec 4, 2020 at 7:53 AM Craig Ringer
> <craig(dot)ringer(at)enterprisedb(dot)com> wrote:
> >
> > On Thu, 3 Dec 2020 at 17:25, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > > Is there any fundamental problem if
> > > we commit the transaction after initial copy and slot creation in
> > > LogicalRepSyncTableStart and then allow the apply of transactions as
> > > it happens in apply worker?
> >
> > No fundamental problem. Both approaches are fine. Committing the
> > initial copy then doing the rest in individual txns means an
> > incomplete sync state for the table becomes visible, which may not be
> > ideal. Ideally we'd do something like sync the data into a clone of
> > the table then swap the table relfilenodes out once we're synced up.
> >
> > IMO the main advantage of committing as we go is that it would let us
> > use a non-temporary slot and support recovering an incomplete sync and
> > finishing it after interruption by connection loss, crash, etc. That
> > would be advantageous for big table syncs or where the sync has lots
> > of lag to replay. But it means we have to remember sync states, and
> > give users a way to cancel/abort them. Otherwise forgotten temp slots
> > for syncs will cause a mess on the upstream.
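
As a side note on the "forgotten slots" concern: any leftover sync slot
is at least visible from plain SQL on the publisher. Something like the
query below (just an illustrative check, not part of any patch) shows
each logical slot together with its temporary and active flags, which is
enough to spot slots left behind by an interrupted sync:

-- Run on the publisher (upstream) node.
-- Leftover sync slots would show up here as logical slots that are
-- not temporary and not active; those are the ones a user would need
-- a way to clean up.
SELECT slot_name, slot_type, temporary, active, restart_lsn
FROM pg_replication_slots
WHERE slot_type = 'logical'
ORDER BY slot_name;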
> >
> > It also allows the sync slot to advance, freeing any held upstream
> > resources before the whole sync is done, which is good if the upstream
> > is busy and generating lots of WAL.
> >
> > Finally, committing as we go means we won't exceed the cid increment
> > limit in a single txn.
> >
>
>
> Yeah, all these are advantages of processing
> transaction-by-transaction. IIUC, we primarily need to do two things
> to achieve it: one is to have an additional state in the catalog (say
> "catch up") which will indicate that the initial copy is done, and
> the other is to use a permanent slot through which we can track the
> replication progress, so that after a restart (due to crash,
> connection break, etc.) we can resume from the appropriate position.
>
> Apart from the above, I think with the current design of tablesync we
> can see partial data of transactions because we allow all the
> tablesync workers to run in parallel. Consider the below scenario:
>
..
..
>
> Basically, the results of Tx1, Tx2, and Tx3 are visible for mytbl2 but
> not for mytbl1. To reproduce this, I stopped the tablesync workers
> (via debugger) for mytbl1 and mytbl2 in LogicalRepSyncTableStart
> before they change the relstate to SUBREL_STATE_SYNCWAIT, then allowed
> Tx2 and Tx3 to be processed by the apply worker, and then allowed the
> tablesync worker for mytbl2 to proceed. After that, I can see the
> above state.
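
In case it helps anyone trying the same thing, a minimal two-table
setup of this kind would look roughly like the below (the table
definitions, publication/subscription names, and connection string are
only illustrative; the Tx1/Tx2/Tx3 workload itself is as in the
scenario above):

-- Publisher: two tables in one publication, so the subscriber starts
-- a separate tablesync worker for each table.
CREATE TABLE mytbl1 (id int PRIMARY KEY, val text);
CREATE TABLE mytbl2 (id int PRIMARY KEY, val text);
CREATE PUBLICATION mypub FOR TABLE mytbl1, mytbl2;

-- Subscriber: same table definitions, then subscribe.
CREATE TABLE mytbl1 (id int PRIMARY KEY, val text);
CREATE TABLE mytbl2 (id int PRIMARY KEY, val text);
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION mypub;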
>
> Now, won't this behavior be considered a transaction inconsistency,
> where partial transaction data or later transaction data is visible?
> I don't think we can have such a situation on the master (publisher)
> node or on a physical standby.
>

On briefly checking the pglogical code [1], it seems this problem
won't be there in pglogical, because it appears to first copy all the
tables (via pglogical_sync_table) in one process and then catch up
with the apply worker in a transaction-by-transaction manner. Am I
reading it correctly? If so, why did we follow a different approach
for the in-core solution, or is it that pglogical has improved over
time but those improvements can't be implemented in-core because of
some missing features?

[1] - https://github.com/2ndQuadrant/pglogical

--
With Regards,
Amit Kapila.
