Re: Get stuck when dropping a subscription during synchronizing table

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: robertmhaas(at)gmail(dot)com
Cc: sawada(dot)mshk(at)gmail(dot)com, petr(dot)jelinek(at)2ndquadrant(dot)com, michael(dot)paquier(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Get stuck when dropping a subscription during synchronizing table
Date: 2017-05-19 03:55:55
Message-ID: 20170519.125555.244660790.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello,

At Thu, 18 May 2017 10:16:35 -0400, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in <CA+TgmobJk9QWkHp98pxWk8rMe-EC8BVdE6F9zPH6Yt1dbAGYBg(at)mail(dot)gmail(dot)com>
> On Wed, May 17, 2017 at 6:58 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > I think the above changes can solve this issue but It seems to me that
> > holding AccessExclusiveLock on pg_subscription by DROP SUBSCRIPTION
> > until commit could lead another deadlock problem in the future. So I'd
> > to contrive ways to reduce lock level somehow if possible. For
> > example, if we change the apply launcher so that it gets the
> > subscription list only when pg_subscription gets invalid, apply
> > launcher cannot try to launch the apply worker being stopped. We
> > invalidate pg_subscription at commit of DROP SUBSCRIPTION and the
> > apply launcher can get new subscription list which doesn't include the
> > entry we removed. That way we can reduce lock level to
> > ShareUpdateExclusiveLock and solve this issue.
> > Also in your patch, we need to change DROP SUBSCRIPTION as well to
> > resolve another case I encountered, where DROP SUBSCRIPTION waits for
> > apply worker while holding a tuple lock on pg_subscription_rel and the
> > apply worker waits for same tuple on pg_subscription_rel in
> > SetSubscriptionRelState().

Sorry, I don't have enough time to consider this in depth right
now. I may come back to it later.

> I don't really understand the issue being discussed here in any
> detail, but as a general point I'd say that it might be more
> productive to make the locks finer-grained rather than struggling to
> reduce the lock level. For example, instead of locking all of
> pg_subscription, use LockSharedObject() to lock the individual
> subscription, still with AccessExclusiveLock. That means that other
> accesses to that subscription also need to take a lock so that you
> actually get a conflict when there should be one, but that should be
> doable. I expect that trying to manage locking conflicts using only
> catalog-wide locks is a doomed strategy.

Thank you for the suggestion. I think the issue is a bit
different from that. The problem here is that a replication
worker may try to read exactly the tuple for the subscription
being dropped just before it responds to a termination request
it has received. So a finer-grained lock doesn't help.

The focus in resolving this is to prevent workers from being
blocked by DROP SUBSCRIPTION. So Sawada-san's patch releases the
lock on pg_subscription immediately and uses another lock for
exclusion. My patch just gives up reading the catalog when it is
not available.
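
As a rough illustration of the "give up reading when not available"
idea only (this is not code from either patch, and Catalog,
worker_step and stop_requested are hypothetical names, with a plain
Python lock standing in for the lock on pg_subscription), a worker
could take the lock conditionally and go on to its termination check
instead of waiting:

```python
import threading


class Catalog:
    """Hypothetical stand-in for pg_subscription, guarded by one lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.rows = {"sub1": "enabled"}


def worker_step(catalog, stop_requested):
    """One iteration of a hypothetical worker loop.

    Returns "read", "skipped" or "terminated".
    """
    # Conditional acquire: never wait for whoever holds the lock
    # (for example a session running DROP SUBSCRIPTION).
    if catalog.lock.acquire(blocking=False):
        try:
            _ = catalog.rows.get("sub1")
            result = "read"
        finally:
            catalog.lock.release()
    else:
        # Catalog not available: give up reading for this iteration.
        result = "skipped"

    # Because the worker did not block above, it always reaches the
    # termination check and can respond to a stop request.
    if stop_requested.is_set():
        return "terminated"
    return result
```

With the lock held elsewhere and a stop request pending, worker_step
returns "terminated" rather than waiting on the lock, which is the
behaviour the dropping session needs from the worker.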

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
