Re: Get stuck when dropping a subscription during synchronizing table

From: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Get stuck when dropping a subscription during synchronizing table
Date: 2017-05-19 13:33:04
Message-ID: 137bd4a9-e9b5-992b-831d-637c6fc73326@2ndquadrant.com
Lists: pgsql-hackers

On 18/05/17 16:16, Robert Haas wrote:
> On Wed, May 17, 2017 at 6:58 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> I think the above changes can solve this issue, but it seems to me that
>> holding AccessExclusiveLock on pg_subscription in DROP SUBSCRIPTION
>> until commit could lead to another deadlock problem in the future. So
>> I'd like to find a way to reduce the lock level if possible. For
>> example, if we change the apply launcher so that it fetches the
>> subscription list only when pg_subscription is invalidated, the apply
>> launcher cannot try to launch the apply worker being stopped. We
>> invalidate pg_subscription at commit of DROP SUBSCRIPTION, and the
>> apply launcher then gets a new subscription list that doesn't include
>> the entry we removed. That way we can reduce the lock level to
>> ShareUpdateExclusiveLock and solve this issue.
>> Also in your patch, we need to change DROP SUBSCRIPTION as well to
>> resolve another case I encountered, where DROP SUBSCRIPTION waits for
>> the apply worker while holding a tuple lock on pg_subscription_rel and
>> the apply worker waits for the same tuple on pg_subscription_rel in
>> SetSubscriptionRelState().
>
> I don't really understand the issue being discussed here in any
> detail, but as a general point I'd say that it might be more
> productive to make the locks finer-grained rather than struggling to
> reduce the lock level. For example, instead of locking all of
> pg_subscription, use LockSharedObject() to lock the individual
> subscription, still with AccessExclusiveLock. That means that other
> accesses to that subscription also need to take a lock so that you
> actually get a conflict when there should be one, but that should be
> doable. I expect that trying to manage locking conflicts using only
> catalog-wide locks is a doomed strategy.

We do call LockSharedObject(), but it's rather useless the way it's done
now, as no other access path takes that lock. We can't block all other
accesses, however: the workers need to be able to access the catalog
during clean shutdown in some situations. What we need is to block the
starting of new workers for that subscription, so only those code paths
would need to block on it. So I think we might want to do both:
finer-grained locking and a lower lock level.
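
To illustrate the idea only, here is a toy model in Python, not PostgreSQL
code. It assumes a hypothetical ObjectLockTable keyed by (class, object id)
and a non-blocking try_lock(), whereas the real lock manager would make the
caller wait. The names launcher_start_worker and worker_read_catalog are
invented for the sketch. The point it shows: only the worker-start path
takes the per-subscription lock, so DROP blocks new workers for that one
subscription while shutdown-time catalog reads and other subscriptions
are unaffected.

```python
import threading


class ObjectLockTable:
    """Toy stand-in for per-object locks, keyed by (class_id, object_id)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._owners = {}  # (class_id, object_id) -> owner name

    def try_lock(self, class_id, object_id, owner):
        """Take an exclusive lock without waiting; return True on success."""
        key = (class_id, object_id)
        with self._guard:
            if key in self._owners and self._owners[key] != owner:
                return False
            self._owners[key] = owner
            return True

    def unlock(self, class_id, object_id, owner):
        """Release the lock if this owner holds it."""
        key = (class_id, object_id)
        with self._guard:
            if self._owners.get(key) == owner:
                del self._owners[key]


SUBSCRIPTION = "pg_subscription"  # hypothetical class identifier


def launcher_start_worker(locks, catalog, subid):
    """Worker-start path: the only path that must conflict with DROP."""
    owner = f"launcher:{subid}"
    if not locks.try_lock(SUBSCRIPTION, subid, owner):
        return False  # subscription is being dropped; start nothing
    try:
        # Recheck under the lock: the subscription may be gone already.
        return subid in catalog
    finally:
        locks.unlock(SUBSCRIPTION, subid, owner)


def worker_read_catalog(catalog, subid):
    """Shutdown path: reads the catalog without taking the object lock."""
    return catalog.get(subid)


locks = ObjectLockTable()
catalog = {1: "sub_a", 2: "sub_b"}

# DROP SUBSCRIPTION for subid 1 holds the per-object lock until commit.
drop_got_lock = locks.try_lock(SUBSCRIPTION, 1, "drop:1")
blocked = launcher_start_worker(locks, catalog, 1)   # conflicts with DROP
other = launcher_start_worker(locks, catalog, 2)     # different object
still_readable = worker_read_catalog(catalog, 1)     # no lock needed

# Commit: the row disappears, then the lock is released.
del catalog[1]
locks.unlock(SUBSCRIPTION, 1, "drop:1")
after_commit = launcher_start_worker(locks, catalog, 1)
```

In this model the launcher is refused for subscription 1 while the drop is
in progress and again after commit (the row is gone), subscription 2 is
never affected, and the stopping worker can still read its catalog row.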

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
