Re: [HACKERS] Two pass CheckDeadlock in contentent case

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Sokolov Yura <y(dot)sokolov(at)postgrespro(dot)ru>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Two pass CheckDeadlock in contentent case
Date: 2018-07-23 12:38:14
Message-ID: CANP8+j+dS5BEYDuTqjLE=icoieL0GajzVvF17mrCWt_tP86qww@mail.gmail.com
Lists: pgsql-hackers

On 3 October 2017 at 15:30, Sokolov Yura <y(dot)sokolov(at)postgrespro(dot)ru> wrote:

> If hundreds of backends reach this timeout while trying to acquire
> an advisory lock on the same value, it leads to a hard stall for many
> seconds, because they all traverse the same huge lock graph under an
> exclusive lock.
> During this stall no meaningful operation is possible
> (no new transaction can begin).

Well observed, we clearly need to improve this.

> The attached patch makes CheckDeadlock do two passes:
> - The first pass takes LW_SHARED on the partitions of the lock hash.
> DeadLockCheck is called with a "readonly" flag, so it doesn't
> modify anything.
> - If a possible "soft" or "hard" deadlock is detected,
> i.e. if the lock graph needs to be modified, then the partitions
> are relocked with LW_EXCLUSIVE and DeadLockCheck is called again.
>
> This fixes the hard stall, because backends walk the lock graph under
> a shared lock and find that there is no real deadlock.
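
For readers following along, here is a minimal sketch of the two-pass
control flow described in the quoted text. It is not the patch itself:
ToyPartitionLocks, ToyWaitGraph and every toy_* function are hypothetical
stand-ins, the "locks" only count acquisitions, and "resolving" a deadlock
is reduced to dropping the waiter's edges.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define N_PROCS 4

/* Hypothetical stand-in for the lock-partition LWLocks: it only counts
 * acquisitions so the control flow can be observed. */
typedef struct {
    int shared_acquired;
    int exclusive_acquired;
} ToyPartitionLocks;

/* waits_for[a][b] is true when toy process a waits for toy process b. */
typedef struct {
    bool waits_for[N_PROCS][N_PROCS];
} ToyWaitGraph;

/* Depth-first search: can we get from 'node' back to 'start'? */
static bool toy_reaches(const ToyWaitGraph *g, int node, int start, bool *seen)
{
    for (int next = 0; next < N_PROCS; next++) {
        if (!g->waits_for[node][next])
            continue;
        if (next == start)
            return true;
        if (!seen[next]) {
            seen[next] = true;
            if (toy_reaches(g, next, start, seen))
                return true;
        }
    }
    return false;
}

/* Read-only check: is 'proc' part of a wait cycle? Modifies nothing. */
static bool toy_deadlock_check_readonly(const ToyWaitGraph *g, int proc)
{
    bool seen[N_PROCS] = { false };

    assert(proc >= 0 && proc < N_PROCS);
    seen[proc] = true;
    return toy_reaches(g, proc, proc, seen);
}

/* Two-pass check; returns true if a deadlock was found and resolved. */
static bool toy_check_deadlock(ToyPartitionLocks *locks, ToyWaitGraph *g, int proc)
{
    /* Pass 1: "shared" mode, read-only walk of the graph. */
    locks->shared_acquired++;
    if (!toy_deadlock_check_readonly(g, proc))
        return false;           /* common case: no exclusive lock needed */

    /* Pass 2: "exclusive" mode. Re-check, since the graph may have
     * changed between releasing shared and acquiring exclusive. */
    locks->exclusive_acquired++;
    if (!toy_deadlock_check_readonly(g, proc))
        return false;

    /* Toy resolution: drop this process's wait edges. */
    memset(g->waits_for[proc], 0, sizeof g->waits_for[proc]);
    return true;
}
```

The point of the sketch is only that the exclusive path is entered when
the read-only pass reports a possible cycle, and that the second pass
repeats the check rather than trusting the first result.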

In phase 2, does this relock only the partitions required to reorder
the lock graph, or does it request all locks? Fewer locks would be
better.

If you decide to reorder the lock graph, then only one backend should
attempt this at a time. We should keep track of reorder-requests, so
if two backends arrive at the same conclusion then only one should
proceed to do this.
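
A rough sketch of what "only one backend reorders at a time" could look
like, using a C11 atomic flag as the reorder-request tracker. This is an
assumption for illustration only: ToyReorderState and the toy_* functions
are hypothetical names, and real shared-memory code would need to fit the
existing LWLock and spinlock conventions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state: the flag is set while some backend is
 * reordering the lock graph. */
typedef struct {
    atomic_flag reorder_in_progress;
    int reorders_done;
} ToyReorderState;

static void toy_reorder_state_init(ToyReorderState *s)
{
    atomic_flag_clear(&s->reorder_in_progress);
    s->reorders_done = 0;
}

/* Returns true if the caller claimed the right to reorder.
 * atomic_flag_test_and_set returns the previous value, so a false
 * result means nobody else held the claim. */
static bool toy_try_begin_reorder(ToyReorderState *s)
{
    return !atomic_flag_test_and_set(&s->reorder_in_progress);
}

/* Called by the claiming backend once the reorder is finished. */
static void toy_end_reorder(ToyReorderState *s)
{
    s->reorders_done++;     /* only the claim holder gets here */
    atomic_flag_clear(&s->reorder_in_progress);
}
```

A backend that fails toy_try_begin_reorder() would skip the reorder and
go back to waiting, on the assumption that the winner reaches the same
conclusion it did.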

Many deadlocks happen between locks in the same table. It would also be a
useful optimization to check just one partition for lock graphs before
we check all partitions.
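
To make the single-partition idea concrete, here is a small sketch under
stated assumptions: each wait edge is tagged with the partition of the
lock being waited on, and the check first follows only edges in the
waiter's own partition before falling back to the whole graph.
ToyPartGraph and the toy_* functions are hypothetical and not taken from
the lock manager.

```c
#include <assert.h>
#include <stdbool.h>

#define N_PROCS      4
#define N_PARTITIONS 2

/* edge_partition[a][b] is -1 when a does not wait for b, otherwise the
 * partition of the lock that a is waiting on. */
typedef struct {
    int edge_partition[N_PROCS][N_PROCS];
} ToyPartGraph;

static void toy_part_graph_init(ToyPartGraph *g)
{
    for (int a = 0; a < N_PROCS; a++)
        for (int b = 0; b < N_PROCS; b++)
            g->edge_partition[a][b] = -1;
}

/* DFS back to 'start'; if only_partition >= 0, ignore other partitions. */
static bool toy_part_reaches(const ToyPartGraph *g, int node, int start,
                             int only_partition, bool *seen)
{
    for (int next = 0; next < N_PROCS; next++) {
        int p = g->edge_partition[node][next];

        if (p < 0 || (only_partition >= 0 && p != only_partition))
            continue;
        if (next == start)
            return true;
        if (!seen[next]) {
            seen[next] = true;
            if (toy_part_reaches(g, next, start, only_partition, seen))
                return true;
        }
    }
    return false;
}

static bool toy_cycle_through(const ToyPartGraph *g, int proc, int only_partition)
{
    bool seen[N_PROCS] = { false };

    assert(proc >= 0 && proc < N_PROCS);
    seen[proc] = true;
    return toy_part_reaches(g, proc, proc, only_partition, seen);
}

/* Returns how many partitions had to be examined (1 or N_PARTITIONS)
 * and stores whether a cycle through 'proc' was found. */
static int toy_partition_first_check(const ToyPartGraph *g, int proc,
                                     int wait_partition, bool *found)
{
    /* Cheap attempt: only the partition we are waiting in. */
    if (toy_cycle_through(g, proc, wait_partition)) {
        *found = true;
        return 1;
    }
    /* Fallback: the cycle, if any, spans partitions. */
    *found = toy_cycle_through(g, proc, -1);
    return N_PARTITIONS;
}
```

The cheap pass can only confirm a deadlock; when it finds nothing, the
full check is still needed.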

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
