Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

From: Andres Freund <andres(at)anarazel(dot)de>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
Date: 2018-06-04 18:42:23
Message-ID: 20180604184223.fph4fhl6kyfam7lq@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2018-06-04 16:47:29 +0300, Konstantin Knizhnik wrote:
> We at PostgresPro faced the lock extension contention problem at two
> more customers and tried to use this patch (v13) to address this issue.
> Unfortunately, replacing the heavyweight lock with an lwlock couldn't completely
> eliminate contention; now most backends are blocked on a condition variable:
>
> 0x00007fb03a318903 in __epoll_wait_nocancel () from /lib64/libc.so.6
> #0  0x00007fb03a318903 in __epoll_wait_nocancel () from /lib64/libc.so.6
> #1  0x00000000007024ee in WaitEventSetWait ()
> #2  0x0000000000718fa6 in ConditionVariableSleep ()
> #3  0x000000000071954d in RelExtLockAcquire ()

That doesn't necessarily mean that the postgres code is at fault
here. It's entirely possible that the filesystem or storage is the
bottleneck. Could you briefly describe workload & hardware?

> The second problem we observed was even more critical: if a backend is granted
> the relation extension lock and then gets some error before releasing this lock,
> then abort of the current transaction doesn't release this lock (unlike
> a heavyweight lock) and the relation is kept locked.
> So the database is actually stalled and the server has to be restarted.

That obviously needs to be fixed...

Greetings,

Andres Freund
