Re: Single transaction in the tablesync worker?

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>
Subject: Re: Single transaction in the tablesync worker?
Date: 2021-01-07 08:05:59
Message-ID: CAHut+PtN18hcM_jiD8PZb_j-CUZzNHgW8om37MsZqt_qGkRRpQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 4, 2021 at 8:06 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Dec 30, 2020 at 11:51 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > On Wed, Dec 23, 2020 at 8:43 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > 1.
> > > + * Rarely, the DropSubscription may be issued when a tablesync still
> > > + * is in SYNCDONE but not yet in READY state. If this happens then
> > > + * the drop slot could fail because it is already dropped.
> > > + * In this case suppress and drop slot error.
> > > + *
> > > + * FIXME - Is there a better way than this?
> > > + */
> > > + if (rstate->state != SUBREL_STATE_SYNCDONE)
> > > + PG_RE_THROW();
> > >
> > > So, does this situation happens when we try to drop subscription after
> > > the state is changed to syncdone but not syncready. If so, then can't
> > > we write a function GetSubscriptionNotDoneRelations similar to
> > > GetSubscriptionNotReadyRelations where we get a list of relations that
> > > are not in done stage. I think this should be safe because once we are
> > > here we shouldn't be allowed to start a new worker and old workers are
> > > already stopped by this function.
> >
> > Yes, but I don't see how adding such a function is an improvement over
> > the existing code:
> >
>
> The advantage is that we don't need to use try..catch to deal with
> such conditions which I don't think is a good way to deal with such
> cases. Also, even after using try...catch, still, we can leak the
> slots because the patch drops the slot after changing the state to
> syncdone and if there is any error while dropping the slot, it simply
> skips it. So, it is possible that the rel state is syncdone but the
> slot still exists and we get an error due to some different reason,
> and then we will silently skip it again and allow the subscription to
> be dropped.
>
> I think instead what we should do is to drop the slot before we change
> the rel state to syncdone. Also, if the apply workers fail to drop the
> slot, it should try to again drop it after restart. In
> DropSubscription, we can then check if the rel state is not SYNC or
> READY, we can drop the corresponding slots.
>

Fixed as suggested in latest patch [v12]

----

[v12] = https://www.postgresql.org/message-id/CAHut%2BPsonJzarxSBWkOM%3DMjoEpaq53ShBJoTT9LHJskwP3OvZA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2021-01-07 08:08:21 Re: Single transaction in the tablesync worker?
Previous Message Shinya11.Kato 2021-01-07 07:55:36 RE: Wrong statistics for size of XLOG_SWITCH during pg_waldump.