Re: Get stuck when dropping a subscription during synchronizing table

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Get stuck when dropping a subscription during synchronizing table
Date: 2017-06-15 01:22:55
Message-ID: CAD21AoBW4U_D+g6QB=WpHhJxS2cEZB7XHXYZc2jgVVMAATM_PA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 15, 2017 at 7:35 AM, Petr Jelinek
<petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
> On 13/06/17 21:49, Peter Eisentraut wrote:
>> On 6/13/17 02:33, Noah Misch wrote:
>>>> Steps to reproduce -
>>>> X cluster -> create 100 tables, publish all tables (create publication pub
>>>> for all tables);
>>>> Y cluster -> create 100 tables, create subscription (create subscription sub
>>>> connection 'user=centos host=localhost' publication pub);
>>>> Y cluster -> drop subscription (drop subscription sub);
>>>>
>>>> check the log file on Y cluster.
>>>>
>>>> Sometimes I have seen this error at the psql prompt, and the drop
>>>> subscription operation failed on the first attempt.
>>>>
>>>> postgres=# drop subscription sub;
>>>> ERROR: tuple concurrently updated
>>>> postgres=# drop subscription sub;
>>>> NOTICE: dropped replication slot "sub" on publisher
>>>> DROP SUBSCRIPTION
>>>
>>> [Action required within three days. This is a generic notification.]
>>
>> It's being worked on. Let's see by Thursday.
>>
>
> The attached patch fixes it (it was mostly about the order of calls). I also
> split SetSubscriptionRelState into two separate interfaces while I was
> changing it, because now that the update_only bool has been added it has
> become quite strange to have a single interface for what are basically two
> separate functions.
>
> There are still a couple of remaining issues from this thread, though.
> Namely, the AccessExclusiveLock on the pg_subscription catalog, which is
> not very pretty, but we need a way to block the launcher from accessing the
> subscription that is being dropped and to make sure it will not start new
> workers for it afterwards. The question is how, however, since by the time
> the launcher can lock an individual subscription it is already processing it.
> So it looks to me like we'd need to reread the catalog with a new snapshot
> after the lock is acquired, which seems a bit wasteful (I wonder if we
> could just AcceptInvalidationMessages and refetch from the syscache). Any
> better ideas?
>
> Another related problem is the locking of subscriptions during operations
> on them; AlterSubscription especially seems like it should lock the
> subscription itself. I did that in 0002.
>
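
As a rough illustration of the "lock, then re-read" idea raised above, here
is a toy Python model. It is not PostgreSQL code: the Catalog class, its
methods, and launch_workers are all hypothetical names made up for this
sketch, and a per-subscription threading.Lock merely stands in for an
object-level lock. It only shows why a list read before locking has to be
rechecked once the lock is held.

```python
import threading


class Catalog:
    """Toy stand-in for pg_subscription: maps subscription id -> name."""

    def __init__(self, subs):
        self._subs = dict(subs)
        self._locks = {subid: threading.Lock() for subid in subs}

    def snapshot(self):
        # A copy taken at one point in time; it can be stale right away.
        return dict(self._subs)

    def lock(self, subid):
        return self._locks[subid]

    def exists(self, subid):
        # Re-read of the current state, meant to be called under the lock.
        return subid in self._subs

    def drop(self, subid):
        # The "DROP SUBSCRIPTION" side takes the same per-subscription lock.
        with self._locks[subid]:
            self._subs.pop(subid, None)


def launch_workers(catalog, stale_snapshot):
    """Return ids we would start workers for, rechecking each one under lock."""
    started = []
    for subid in stale_snapshot:
        with catalog.lock(subid):
            if catalog.exists(subid):  # recheck after the lock is acquired
                started.append(subid)
    return started


catalog = Catalog({1: "sub_a", 2: "sub_b"})
snap = catalog.snapshot()  # the launcher reads the catalog
catalog.drop(2)            # a concurrent drop happens before the launcher locks
print(launch_workers(catalog, snap))  # prints [1]
```

Without the exists() recheck, the launcher in this model would act on
subscription 2 even though it has already been dropped.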

Thank you for the patch! Sorry, I don't have time for it today, but
I'll review these patches tomorrow.
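
For reference, the reproduction steps quoted above could be scripted roughly
as follows. This is only a sketch that generates the SQL text; the table
names (t1 .. t100) and the single-column table definition are assumptions,
since the report does not say what the tables look like. The publication
name, subscription name, and connection string are copied from the report
as-is.

```python
def create_tables_sql(n_tables=100):
    # Assumed table shape: the original report only says "create 100 tables".
    return [f"CREATE TABLE t{i} (id int PRIMARY KEY);" for i in range(1, n_tables + 1)]


def publisher_sql(n_tables=100):
    """Statements for cluster X (the publisher)."""
    stmts = create_tables_sql(n_tables)
    stmts.append("CREATE PUBLICATION pub FOR ALL TABLES;")
    return stmts


def subscriber_sql(n_tables=100):
    """Statements for cluster Y (the subscriber)."""
    stmts = create_tables_sql(n_tables)
    stmts.append(
        "CREATE SUBSCRIPTION sub "
        "CONNECTION 'user=centos host=localhost' PUBLICATION pub;"
    )
    # Dropping right away, while the initial table sync is still running,
    # is the step that triggered the reported problem.
    stmts.append("DROP SUBSCRIPTION sub;")
    return stmts


print("-- cluster X")
print("\n".join(publisher_sql()))
print("-- cluster Y")
print("\n".join(subscriber_sql()))
```

The output can be fed to psql on each cluster; the problem is timing
dependent, so it may take several attempts to hit it.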

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
