Re: Get stuck when dropping a subscription during synchronizing table

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Get stuck when dropping a subscription during synchronizing table
Date: 2017-05-09 17:46:50
Message-ID: CAD21AoChkuWzVbry35zn6vMyLo0ff6kTzEkPGOrrSH1qpr9QkQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 8, 2017 at 8:42 PM, Petr Jelinek
<petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
> On 08/05/17 11:27, Masahiko Sawada wrote:
>> Hi,
>>
>> I encountered a situation where DROP SUBSCRIPTION got stuck when
>> initial table sync is in progress. In my environment, I created
>> several tables with some data on publisher. I created subscription on
>> subscriber and drop subscription immediately after that. It doesn't
>> always happen but I often encountered it on my environment.
>>
>> ps -x command shows the following.
>>
>> 96796 ? Ss 0:00 postgres: masahiko postgres [local] DROP
>> SUBSCRIPTION
>> 96801 ? Ts 0:00 postgres: bgworker: logical replication
>> worker for subscription 40993 waiting
>> 96805 ? Ss 0:07 postgres: bgworker: logical replication
>> worker for subscription 40993 sync 16418
>> 96806 ? Ss 0:01 postgres: wal sender process masahiko [local] idle
>> 96807 ? Ss 0:00 postgres: bgworker: logical replication
>> worker for subscription 40993 sync 16421
>> 96808 ? Ss 0:00 postgres: wal sender process masahiko [local] idle
>>
>> The DROP SUBSCRIPTION process (pid 96796) is waiting for the apply
>> worker process (pid 96801) to stop while holding a lock on
>> pg_subscription_rel. On the other hand the apply worker is waiting for
>> acquiring a tuple lock on pg_subscription_rel needed for heap_update.
>> Also table sync workers (pid 96805 and 96807) are waiting for the
>> apply worker process to change their status.
>>
>
> Looks like we should kill apply before dropping dependencies.

Sorry, after investigating I found that the DROP SUBSCRIPTION process
is holding an AccessExclusiveLock on pg_subscription (not
pg_subscription_rel), and the apply worker is waiting to acquire a lock
on it. So I guess that dropping the dependencies is not relevant to
this. It seems to me that the main cause is that DROP SUBSCRIPTION
waits for the apply worker to finish while still holding the
AccessExclusiveLock on pg_subscription. Perhaps we need to find a way
to reduce the lock level somehow.
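
To make the circular wait concrete, here is a toy model of who waits
for whom, using the pids from the ps output above. It is only a sketch
of the situation as I understand it, not PostgreSQL code: the
waits_for mapping and the helper names find_cycle and is_stuck are
made up for illustration, and each process is assumed to wait on at
most one other process.

```python
def find_cycle(waits_for):
    """Return the processes forming a wait cycle, or None if there is none."""
    for start in waits_for:
        path = [start]
        seen = {start}
        node = waits_for.get(start)
        while node is not None:
            if node == start:
                return path          # walked back to where we began
            if node in seen:
                break                # hit a cycle that does not include start
            path.append(node)
            seen.add(node)
            node = waits_for.get(node)
    return None


def is_stuck(proc, waits_for):
    """True if following proc's wait chain never reaches a free process."""
    seen = set()
    while proc in waits_for:
        if proc in seen:
            return True
        seen.add(proc)
        proc = waits_for[proc]
    return False


# Hypothetical wait-for edges reconstructed from the description above.
waits_for = {
    "DROP SUBSCRIPTION (96796)": "apply worker (96801)",   # waits for it to stop
    "apply worker (96801)": "DROP SUBSCRIPTION (96796)",   # waits for its lock
    "sync worker (96805)": "apply worker (96801)",         # waits for status change
    "sync worker (96807)": "apply worker (96801)",
}

cycle = find_cycle(waits_for)

# If DROP SUBSCRIPTION did not hold the lock while waiting, the apply
# worker would no longer be blocked on it and the cycle would disappear.
without_lock_wait = {k: v for k, v in waits_for.items()
                     if k != "apply worker (96801)"}
```

In this model every process, including both table sync workers, is
stuck as long as the edge from the apply worker back to DROP
SUBSCRIPTION exists. Removing that one edge is what reducing the lock
level (or releasing the lock before waiting) would amount to.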

>
>> Also, even when DROP SUBSCRIPTION is done successfully, the table sync
>> worker can be orphaned because I guess that the apply worker can exit
>> before change status of table sync worker.
>
> Well the tablesync worker should stop itself if the subscription got
> removed, but of course again the dependencies are an issue, so we should
> probably kill those explicitly as well.

Yeah, I think we should ensure that the apply worker exits only after
it has killed all the table sync workers involved.
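
The ordering I have in mind can be sketched with plain threads. This
is a toy model and not the worker code: SyncWorker, ApplyWorker,
shutdown and orphaned are hypothetical names, and the relation ids are
just the ones from the ps output. The point is only that the apply
worker signals every table sync worker and waits for each of them
before it marks itself as exited.

```python
import threading


class SyncWorker(threading.Thread):
    """Stand-in for a table sync worker copying one relation."""

    def __init__(self, relid):
        super().__init__(name=f"sync {relid}")
        self.relid = relid
        self.stop_requested = threading.Event()

    def run(self):
        # Pretend to copy the table: block until asked to stop.
        self.stop_requested.wait()


class ApplyWorker:
    """Stand-in for the apply worker that owns the sync workers."""

    def __init__(self, relids):
        self.exited = False
        self.sync_workers = [SyncWorker(r) for r in relids]
        for w in self.sync_workers:
            w.start()

    def shutdown(self):
        # Ask every sync worker to stop first ...
        for w in self.sync_workers:
            w.stop_requested.set()
        # ... wait until each one is really gone ...
        for w in self.sync_workers:
            w.join()
        # ... and only then let the apply worker itself exit.
        self.exited = True


def orphaned(apply):
    """Relation ids of sync workers that are still running."""
    return [w.relid for w in apply.sync_workers if w.is_alive()]
```

With this ordering, orphaned() is empty by the time the apply worker
has exited, so no table sync worker is left behind.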

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
