Re: Deadlock between logrep apply worker and tablesync worker

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Deadlock between logrep apply worker and tablesync worker
Date: 2023-01-23 05:21:43
Message-ID: CAA4eK1KAO3cUBwim_27DGNCReN8Xm+K9+ktkhh449B=CiiT7aQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 23, 2023 at 1:29 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> On my machine, the src/test/subscription/t/002_types.pl test
> usually takes right about 1.5 seconds:
>
> $ time make check PROVE_FLAGS=--timer PROVE_TESTS=t/002_types.pl
> ...
> [14:22:12] t/002_types.pl .. ok 1550 ms ( 0.00 usr 0.00 sys + 0.70 cusr 0.25 csys = 0.95 CPU)
> [14:22:13]
>
> I noticed however that sometimes (at least one try in ten, for me)
> it takes about 2.5 seconds:
>
> [14:22:16] t/002_types.pl .. ok 2591 ms ( 0.00 usr 0.00 sys + 0.69 cusr 0.28 csys = 0.97 CPU)
> [14:22:18]
>
> and I've even seen 3.5 seconds. I dug into this and eventually
> identified the cause: it's a deadlock between a subscription's apply
> worker and a tablesync worker that it's spawned. Sometimes the apply
> worker calls wait_for_relation_state_change (to wait for the tablesync
> worker to finish) while it's holding a lock on pg_replication_origin.
> If that's the case, then when the tablesync worker reaches
> process_syncing_tables_for_sync it is able to perform
> UpdateSubscriptionRelState and reach the transaction commit below
> that; but when it tries to do replorigin_drop_by_name a little further
> down, it blocks on acquiring ExclusiveLock on pg_replication_origin.
> So we have an undetected deadlock. We escape that because
> wait_for_relation_state_change has a 1-second timeout, after which
> it rechecks GetSubscriptionRelState and is able to see the committed
> relation state change; so it continues, and eventually releases its
> transaction and the lock, permitting the tablesync worker to finish.
>
> I've not tracked down the exact circumstances in which the apply
> worker ends up holding a problematic lock, but it seems likely
> that it corresponds to cases where its main loop has itself called
> replorigin_drop_by_name, a bit further up, for some other concurrent
> tablesync operation. (In all the cases I've traced through, the apply
> worker is herding multiple tablesync workers when this happens.)
>
> I experimented with having the apply worker release its locks
> before waiting for the tablesync worker, as attached.
>

I don't see any problem with your proposed change but I was wondering
if it would be better to commit the transaction and release locks
immediately after performing the replication origin drop? By doing
that, we will minimize the amount of time the transaction holds the
lock.

> This passes
> check-world and it seems to eliminate the test runtime instability,
> but I wonder whether it's semantically correct. This whole business
> of taking table-wide ExclusiveLock on pg_replication_origin looks
> like a horrid kluge that we should try to get rid of, not least
> because I don't see any clear documentation of what hazard it's
> trying to prevent.
>

IIRC, this is done to prevent concurrent drops of origin drop say by
exposed API pg_replication_origin_drop(). See the discussion in [1]
related to it. If we want we can optimize it so that we can acquire
the lock on the specific origin as mentioned in comments
replorigin_drop_by_name() but it was not clear that this operation
would be frequent enough.

[1] - https://www.postgresql.org/message-id/CAHut%2BPuW8DWV5fskkMWWMqzt-x7RPcNQOtJQBp6SdwyRghCk7A%40mail.gmail.com

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2023-01-23 05:39:14 Re: Exit walsender before confirming remote flush in logical replication
Previous Message Michael Paquier 2023-01-23 05:15:15 Re: Add a new pg_walinspect function to extract FPIs from WAL records