Re: Deadlock between backend and recovery may not be detected

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, Victor Yegorov <vyegorov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deadlock between backend and recovery may not be detected
Date: 2020-12-22 01:25:58
Message-ID: CAD21AoBRytRGj0TbQVsmD-sqWxBZz9-h-KwO5YDFezjCqXYG2g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 18, 2020 at 6:36 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>
>
>
> On 2020/12/17 2:15, Fujii Masao wrote:
> >
> >
> > On 2020/12/16 23:28, Drouvot, Bertrand wrote:
> >> Hi,
> >>
> >> On 12/16/20 2:36 PM, Victor Yegorov wrote:
> >>>
> >>> *CAUTION*: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
> >>>
> >>>
> >>> ср, 16 дек. 2020 г. в 13:49, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com <mailto:masao(dot)fujii(at)oss(dot)nttdata(dot)com>>:
> >>>
> >>> After doing this procedure, you can see the startup process and backend
> >>> wait for the table lock each other, i.e., deadlock. But this deadlock remains
> >>> even after deadlock_timeout passes.
> >>>
> >>> This seems a bug to me.
> >>>
> >> +1
> >>
> >>>
> >>> > * Deadlocks involving the Startup process and an ordinary backend process
> >>> > * will be detected by the deadlock detector within the ordinary backend.
> >>>
> >>> The cause of this issue seems that ResolveRecoveryConflictWithLock() that
> >>> the startup process calls when recovery conflict on lock happens doesn't
> >>> take care of deadlock case at all. You can see this fact by reading the above
> >>> source code comment for ResolveRecoveryConflictWithLock().
> >>>
> >>> To fix this issue, I think that we should enable STANDBY_DEADLOCK_TIMEOUT
> >>> timer in ResolveRecoveryConflictWithLock() so that the startup process can
> >>> send PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the backend.
> >>> Then if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal arrives,
> >>> the backend should check whether the deadlock actually happens or not.
> >>> Attached is the POC patch implimenting this.
> >>>
> >> good catch!
> >>
> >> I don't see any obvious reasons why the STANDBY_DEADLOCK_TIMEOUT shouldn't be set in ResolveRecoveryConflictWithLock() too (it is already set in ResolveRecoveryConflictWithBufferPin()).
> >>
> >> So + 1 to consider this as a bug and for the way the patch proposes to fix it.
> >
> > Thanks Victor and Bertrand for agreeing!
> > Attached is the updated version of the patch.
>
> Attached is v3 of the patch. Could you review this version?
>
> While the startup process is waiting for recovery conflict on buffer pin,
> it repeats sending the request for deadlock check to all the backends
> every deadlock_timeout. This may increase the workload in the startup
> process and backends, but since this is the original behavior, the patch
> doesn't change that. Also in practice this may not be so harmful because
> the period that the buffer is kept pinned is basically not so long.
>

@@ -529,6 +603,26 @@ ResolveRecoveryConflictWithBufferPin(void)
*/
ProcWaitForSignal(PG_WAIT_BUFFER_PIN);

+ if (got_standby_deadlock_timeout)
+ {
+ /*
+ * Send out a request for hot-standby backends to check themselves for
+ * deadlocks.
+ *
+ * XXX The subsequent ResolveRecoveryConflictWithBufferPin() will wait
+ * to be signaled by UnpinBuffer() again and send a request for
+ * deadlocks check if deadlock_timeout happens. This causes the
+ * request to continue to be sent every deadlock_timeout until the
+ * buffer is unpinned or ltime is reached. This would increase the
+ * workload in the startup process and backends. In practice it may
+ * not be so harmful because the period that the buffer is kept pinned
+ * is basically no so long. But we should fix this?
+ */
+ SendRecoveryConflictWithBufferPin(
+
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK);
+ got_standby_deadlock_timeout = false;
+ }
+

Since SendRecoveryConflictWithBufferPin() sends the signal to all
backends every backend who is waiting on a lock at ProcSleep() and not
holding a buffer pin blocking the startup process will end up doing a
deadlock check, which seems expensive. What is worse is that the
deadlock will not be detected because the deadlock involving a buffer
pin isn't detected by CheckDeadLock(). I thought we can replace
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK with
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN but it’s not good because the
backend who has a buffer pin blocking the startup process and not
waiting on a lock is also canceled after deadlock_timeout. We can have
the backend return from RecoveryConflictInterrupt() when it received
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN and is not waiting on any lock,
but it’s also not good because we cannot cancel the backend after
max_standby_streaming_delay that has a buffer pin blocking the startup
process. So I wonder if we can have a new signal. When the backend
received this signal it returns from RecoveryConflictInterrupt()
without deadlock check either if it’s not waiting on any lock or if it
doesn’t have a buffer pin blocking the startup process. Otherwise it's
cancelled.

Regards,

--
Masahiko Sawada
EnterpriseDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2020-12-22 01:29:03 Re: Better client reporting for "immediate stop" shutdowns
Previous Message Tomas Vondra 2020-12-22 01:07:09 Re: Fix generate_useful_gather_paths for parallel unsafe pathkeys