Re: Single transaction in the tablesync worker?

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>
Subject: Re: Single transaction in the tablesync worker?
Date: 2020-12-05 02:04:35
Message-ID: CAA4eK1JcO0wU-RgBm-_XJWGdsnNzB0iMVF=+SyPxaPUnQ56Bgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 4, 2020 at 7:12 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> On Thu, Dec 3, 2020 at 7:24 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Thu, Dec 3, 2020 at 7:04 PM Ashutosh Bapat
> > <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Dec 3, 2020 at 2:55 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > The tablesync worker in logical replication performs the table data
> > > > sync in a single transaction which means it will copy the initial data
> > > > and then catch up with apply worker in the same transaction. There is
> > > > a comment in LogicalRepSyncTableStart ("We want to do the table data
> > > > sync in a single transaction.") saying so but I can't find the
> > > > concrete theory behind the same. Is there any fundamental problem if
> > > > we commit the transaction after initial copy and slot creation in
> > > > LogicalRepSyncTableStart and then allow the apply of transactions as
> > > > it happens in apply worker? I have tried doing so in the attached (a
> > > > quick prototype to test) and didn't find any problems with regression
> > > > tests. I have tried a few manual tests as well to see if it works and
> > > > didn't find any problem. Now, it is quite possible that it is
> > > > mandatory to do the way we are doing currently, or maybe something
> > > > else is required to remove this requirement but I think we can do
> > > > better with respect to comments in this area.
> > >
> > > If we commit the initial copy, the data upto the initial copy's
> > > snapshot will be visible downstream. If we apply the changes by
> > > committing changes per transaction, the data visible to the other
> > > transactions will differ as the apply progresses.
> > >
> >
> > It is not clear what you mean by the above. The way you have written
> > appears that you are saying that instead of copying the initial data,
> > I am saying to copy it transaction-by-transaction. But that is not the
> > case. I am saying copy the initial data by using REPEATABLE READ
> > isolation level as we are doing now, commit it and then process
> > transaction-by-transaction till we reach sync-point (point till where
> > apply worker has already received the data).
>
> Craig in his mail has clarified this. The changes after the initial
> COPY will be visible before the table sync catches up.
>

I think the problem is not that the changes are visible after COPY
rather it is that we don't have a mechanism to restart if it crashes
after COPY unless we do all the sync up in one transaction. Assume we
commit after COPY and then process transaction-by-transaction and it
errors out (due to connection loss) or crashes, in-between one of the
following transactions after COPY then after the restart we won't know
from where to start for that relation. This is because the catalog
(pg_subscription_rel) will show the state as 'd' (data is being
copied) and the slot would have gone as it was a temporary slot. But
as mentioned in one of my emails above [1] we can solve these problems
which Craig also seems to be advocating for as there are many
advantages of not doing the entire sync (initial copy + stream changes
for that relation) in one single transaction. It will allow us to
support decode of prepared xacts in the subscriber. Also, it seems
pglogical already does processing transaction-by-transaction after the
initial copy. The only thing which is not clear to me is why we
haven't decided to go ahead initially and it would be probably better
if the original authors would also chime-in to at least clarify the
same.

> >
> > > You haven't
> > > clarified whether we will respect the transaction boundaries in the
> > > apply log or not. I assume we will.
> > >
> >
> > It will be transaction-by-transaction.
> >
> > > Whereas if we apply all the
> > > changes in one go, other transactions either see the data before
> > > resync or after it without any intermediate states.
> > >
> >
> > What is the problem even if the user is able to see the data after the
> > initial copy?
> >
> > > That will not
> > > violate consistency, I think.
> > >
> >
> > I am not sure how consistency will be broken.
>
> Some of the transactions applied by apply workers may not have been
> applied by the resync and vice versa. If the intermediate states of
> table resync worker are visible, this difference in applied
> transaction will result in loss of consistency if those transactions
> are changing the table being resynced and some other table in the same
> transaction. The changes won't be atomically visible. Thinking more
> about this, this problem exists today for a table being resynced, but
> at least it's only the table being resynced that is behind the other
> tables so it's predictable.
>

Yeah, I have already shown that this problem [1] exists today and it
won't be predictable when the number of tables to be synced are more.
I am not sure why but it seems acceptable to original authors that the
data of transactions are visibly partially during the initial
synchronization phase for a subscription. I don't see it documented
clearly either.

[1] - https://www.postgresql.org/message-id/CAA4eK1Ld9XaLoTZCoKF_gET7kc1fDf8CPR3CM48MQb1N1jDLYg%40mail.gmail.com

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2020-12-05 02:08:03 Re: Proposed patch for key managment
Previous Message Michael Paquier 2020-12-05 01:43:54 Re: A few new options for CHECKPOINT