Re: Single transaction in the tablesync worker?

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>
Subject: Re: Single transaction in the tablesync worker?
Date: 2020-12-23 09:45:36
Message-ID: CAA4eK1JH8_k7-wuEEdmOTi+R0U=_kU95tvsTU5GAEEiwu-eKyQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 23, 2020 at 11:49 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi Amit.
>
> PSA my v7 WIP patch for the Solution1.
>

Few comments:
================
1.
+ * Rarely, the DropSubscription may be issued when a tablesync still
+ * is in SYNCDONE but not yet in READY state. If this happens then
+ * the drop slot could fail because it is already dropped.
+ * In this case suppress and drop slot error.
+ *
+ * FIXME - Is there a better way than this?
+ */
+ if (rstate->state != SUBREL_STATE_SYNCDONE)
+ PG_RE_THROW();

So, does this situation happens when we try to drop subscription after
the state is changed to syncdone but not syncready. If so, then can't
we write a function GetSubscriptionNotDoneRelations similar to
GetSubscriptionNotReadyRelations where we get a list of relations that
are not in done stage. I think this should be safe because once we are
here we shouldn't be allowed to start a new worker and old workers are
already stopped by this function.

2. Your changes in LogicalRepSyncTableStart() doesn't seem to be
right. IIUC, you are copying the table in one transaction, then
updating the state to SUBREL_STATE_COPYDONE in another transaction,
and after that doing replorigin_advance. Consider what happened if we
error out after the first txn is committed in which we have copied the
table. After the restart, it will again try to copy and lead to an
error. Similarly, consider if we error out after the second
transaction, we won't where to start decoding from. I think all these
should be done in a single transaction.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-12-23 10:09:00 Re: Single transaction in the tablesync worker?
Previous Message Peter Smith 2020-12-23 09:38:51 Re: Single transaction in the tablesync worker?