Double partition lock in bufmgr

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Double partition lock in bufmgr
Date: 2020-12-18 12:20:34
Message-ID: f4f2af4b-246b-4409-0b8d-6b39da064175@postgrespro.ru
Lists: pgsql-hackers

Hi hackers,

I am investigating an incident with one of our customers: performance of
the system has dropped dramatically.
Stack traces of all backends can be found here:
http://www.garret.ru/diag_20201217_102056.stacks_59644
(this file is 6Mb so I have not attached it to this mail).

What I see in these stack traces is that 642 backends are blocked in
LWLockAcquire,
mostly while obtaining a shared buffer lock:

#0  0x00007f0e7fe7a087 in semop () from /lib64/libc.so.6
#1  0x0000000000682fb1 in PGSemaphoreLock
(sema=sema@entry=0x7f0e1c1f63a0) at pg_sema.c:387
#2  0x00000000006ed60b in LWLockAcquire (lock=lock@entry=0x7e8b6176d800,
mode=mode@entry=LW_SHARED) at lwlock.c:1338
#3  0x00000000006c88a7 in BufferAlloc (foundPtr=0x7ffcc3c8de9b "\001",
strategy=0x0, blockNum=997, forkNum=MAIN_FORKNUM, relpersistence=112
'p', smgr=0x2fb2df8) at bufmgr.c:1177
#4  ReadBuffer_common (smgr=0x2fb2df8, relpersistence=<optimized out>,
relkind=<optimized out>, forkNum=forkNum@entry=MAIN_FORKNUM,
blockNum=blockNum@entry=997, mode=RBM_NORMAL, strategy=0x0,
hit=hit@entry=0x7ffcc3c8df97 "") at bufmgr.c:894
#5  0x00000000006c928b in ReadBufferExtended (reln=0x32c7ed0,
forkNum=forkNum@entry=MAIN_FORKNUM, blockNum=997,
mode=mode@entry=RBM_NORMAL, strategy=strategy@entry=0x0) at bufmgr.c:753
#6  0x00000000006c93ab in ReadBuffer (blockNum=<optimized out>,
reln=<optimized out>) at bufmgr.c:685
...

Only 11 of the locks these 642 backends are waiting for are distinct.
Moreover, 358 backends are waiting for one lock and 183 for another.

There are two backends (pids 291121 and 285927) which are trying to
obtain an exclusive lock while already holding another exclusive lock.
And they block all other backends.

This is the only place in bufmgr (and in Postgres) where a process tries to
lock two buffer mapping partitions:

        /*
         * To change the association of a valid buffer, we'll need to have
         * exclusive lock on both the old and new mapping partitions.
         */
        if (oldFlags & BM_TAG_VALID)
        {
            ...
            /*
             * Must lock the lower-numbered partition first to avoid
             * deadlocks.
             */
            if (oldPartitionLock < newPartitionLock)
            {
                LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
                LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
            }
            else if (oldPartitionLock > newPartitionLock)
            {
                LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
                LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
            }
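
To make the ordering rule in the excerpt concrete, here is a minimal
standalone sketch. It is not PostgreSQL code: ToyLock, toy_partitions,
toy_lock_pair and toy_unlock_pair are hypothetical names, a C11 atomic
spin flag stands in for an LWLock held in exclusive mode, and partitions
are compared by array index rather than by lock address. It also shows
the case where both tags fall into the same partition and only one lock
is needed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NUM_PARTITIONS 128  /* assumption: an arbitrary partition count */

/* Hypothetical stand-in for an LWLock taken in exclusive mode. */
typedef struct ToyLock
{
    atomic_bool held;           /* static storage: starts out false */
} ToyLock;

static ToyLock toy_partitions[TOY_NUM_PARTITIONS];

static void
toy_acquire(ToyLock *lock)
{
    /* Spin until we are the one who flips the flag from false to true. */
    while (atomic_exchange(&lock->held, true))
        ;
}

static void
toy_release(ToyLock *lock)
{
    atomic_store(&lock->held, false);
}

/*
 * Lock the partitions of the old and the new tag, lower index first.
 * Returns the number of locks taken: 1 if both tags map to the same
 * partition, 2 otherwise.
 */
static int
toy_lock_pair(int oldPart, int newPart)
{
    int first;
    int second;

    assert(oldPart >= 0 && oldPart < TOY_NUM_PARTITIONS);
    assert(newPart >= 0 && newPart < TOY_NUM_PARTITIONS);

    if (oldPart == newPart)
    {
        /* only one partition, only one lock */
        toy_acquire(&toy_partitions[newPart]);
        return 1;
    }

    first = (oldPart < newPart) ? oldPart : newPart;
    second = (oldPart < newPart) ? newPart : oldPart;

    toy_acquire(&toy_partitions[first]);
    /* A caller can wait here while still holding the first lock. */
    toy_acquire(&toy_partitions[second]);
    return 2;
}

static void
toy_unlock_pair(int oldPart, int newPart)
{
    toy_release(&toy_partitions[oldPart]);
    if (oldPart != newPart)
        toy_release(&toy_partitions[newPart]);
}
```

The fixed order rules out deadlock between two such callers, but anyone
waiting on the first partition stays blocked for as long as the holder
waits on the second one.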

The two backends mentioned above are blocked on the second lock request.
I read all the comments in bufmgr.c and the README file but didn't find
an explanation of why we need to lock both partitions.
Why is it not possible to first free the old buffer (as is done in
InvalidateBuffer) and then repeat the attempt to allocate the buffer?

Yes, it may require more effort than just "grabbing" the buffer.
But in this case there is no need to hold two locks.
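
To illustrate what I mean, here is a rough standalone sketch of the
"free first, then retry" idea. It is only a toy model and not a patch:
all names (toy_map, toy_retarget and so on) are made up, tags are small
integers, and it ignores pins, dirty buffers and the buffer header lock,
which the real code would have to handle. The toy_held / toy_max_held
counters are demo bookkeeping only and are not thread-safe.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_PARTITIONS 4
#define TOY_TAGS       16
#define TOY_NO_BUFFER  (-1)

static atomic_bool toy_part_lock[TOY_PARTITIONS];
static int toy_map[TOY_TAGS];   /* tag -> buffer id, guarded by partition lock */
static int toy_held;            /* locks currently held (demo bookkeeping) */
static int toy_max_held;        /* high-water mark of toy_held */

static int
toy_partition(int tag)
{
    return tag % TOY_PARTITIONS;
}

static void
toy_lock(int part)
{
    while (atomic_exchange(&toy_part_lock[part], true))
        ;
    if (++toy_held > toy_max_held)
        toy_max_held = toy_held;
}

static void
toy_unlock(int part)
{
    toy_held--;
    atomic_store(&toy_part_lock[part], false);
}

static void
toy_reset(void)
{
    for (int i = 0; i < TOY_TAGS; i++)
        toy_map[i] = TOY_NO_BUFFER;
    toy_held = toy_max_held = 0;
}

/*
 * Move buffer "buf" from oldTag to newTag, holding one partition lock
 * at a time.  Returns the buffer that ends up mapped to newTag: "buf"
 * itself, or another buffer id if somebody mapped newTag in between
 * (the caller would then give "buf" back to the free list).
 */
static int
toy_retarget(int buf, int oldTag, int newTag)
{
    int existing;

    /* Step 1: drop the old mapping under the old partition lock only. */
    toy_lock(toy_partition(oldTag));
    if (toy_map[oldTag] == buf)
        toy_map[oldTag] = TOY_NO_BUFFER;
    toy_unlock(toy_partition(oldTag));

    /* Step 2: insert the new mapping under the new partition lock only. */
    toy_lock(toy_partition(newTag));
    existing = toy_map[newTag];
    if (existing == TOY_NO_BUFFER)
        toy_map[newTag] = buf;
    toy_unlock(toy_partition(newTag));

    return (existing == TOY_NO_BUFFER) ? buf : existing;
}
```

The price is the window between step 1 and step 2, in which the buffer
belongs to no mapping, plus the extra handling when somebody else has
mapped the new tag in the meantime.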

I wonder whether anybody has faced similar symptoms in the past, and
whether this problem of holding locks on two partitions in bufmgr has
already been discussed?

P.S.
The customer is using Postgres 9.6, but I have checked that
the same code fragment is present in master.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
