Connections hang indefinitely while taking a gin index's LWLock buffer_content lock(PG10.7)

From: chenhj <chjischj(at)163(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Connections hang indefinitely while taking a gin index's LWLock buffer_content lock(PG10.7)
Date: 2019-09-29 08:16:28
Message-ID: 5c332bd1.87b6.16d7c17aa98.Coremail.chjischj@163.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,all

In our PostgreSQL 10.7(rhel 6.3) database, autovacuum process and many insert processes blocked in gin index's LWLock:buffer_content for long time.

In other words, the following gin index lwlock deadlock phenomenon has occurred again. Since the following bug in 10.7 has been fixed. So this should be a new bug.

https://www.postgresql.org/message-id/flat/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com

We have already obtained coredump files of autovacuum process and one of insert processes.
Unfortunately the insert process(run by gcore) held no lwlock, it should be another process(we did not fetch core file) that hold the lwlock needed for autovacuum process.

the stack is as following:

## stack of one insert process: Acquire lock 0x7f6c517dbfa4 which was held by vacuum process
----------------------------------------------------------------------------
(gdb) bt
#0 0x000000369ea0da00 in sem_wait () from /lib64/libpthread.so.0
#1 0x00000000006a7910 in PGSemaphoreLock (sema=0x7f6c4f76a7b8) at pg_sema.c:316
#2 0x0000000000718225 in LWLockAcquire (lock=0x7f6c517dbfa4, mode=LW_SHARED) at lwlock.c:1233
#3 0x000000000048b622 in ginTraverseLock (buffer=224225, searchMode=0 '\000') at ginbtree.c:40
#4 0x000000000048ca13 in ginFindLeafPage (btree=0x7fffc71c4ea0, searchMode=0 '\000', snapshot=0x0) at ginbtree.c:97
#5 0x00000000004894db in ginInsertItemPointers (index=<value optimized out>, rootBlkno=<value optimized out>, items=<value optimized out>, nitem=<value optimized out>, buildStats=0x0)
at gindatapage.c:1909
#6 0x00000000004863a7 in ginEntryInsert (ginstate=0x1c72158, attnum=1, key=20190913, category=0 '\000', items=0x1c81508, nitem=72, buildStats=0x0) at gininsert.c:214
#7 0x000000000049219a in ginInsertCleanup (ginstate=0x1c72158, full_clean=0 '\000', fill_fsm=1 '\001', forceCleanup=<value optimized out>, stats=<value optimized out>) at ginfast.c:878
#8 0x000000000049308e in ginHeapTupleFastInsert (ginstate=0x1c72158, collector=<value optimized out>) at ginfast.c:443
#9 0x0000000000486749 in gininsert (index=<value optimized out>, values=0x7fffc71c54f0, isnull=0x7fffc71c5600 "", ht_ctid=0x1c6d3a4, heapRel=<value optimized out>,
checkUnique=<value optimized out>, indexInfo=0x1c61da8) at gininsert.c:522
#10 0x00000000005f75f0 in ExecInsertIndexTuples (slot=0x1c62168, tupleid=0x1c6d3a4, estate=0x1c61768, noDupErr=0 '\000', specConflict=0x0, arbiterIndexes=0x0) at execIndexing.c:387
#11 0x0000000000616497 in ExecInsert (pstate=0x1c61ab8) at nodeModifyTable.c:519
#12 ExecModifyTable (pstate=0x1c61ab8) at nodeModifyTable.c:1779
#13 0x00000000005fb6bf in ExecProcNode (queryDesc=0x1c67760, direction=<value optimized out>, count=0, execute_once=-72 '\270') at ../../../src/include/executor/executor.h:250
#14 ExecutePlan (queryDesc=0x1c67760, direction=<value optimized out>, count=0, execute_once=-72 '\270') at execMain.c:1723
#15 standard_ExecutorRun (queryDesc=0x1c67760, direction=<value optimized out>, count=0, execute_once=-72 '\270') at execMain.c:364
#16 0x00007f6e226aa6f8 in pgss_ExecutorRun (queryDesc=0x1c67760, direction=ForwardScanDirection, count=0, execute_once=1 '\001') at pg_stat_statements.c:889
#17 0x00007f6e224a474d in explain_ExecutorRun (queryDesc=0x1c67760, direction=ForwardScanDirection, count=0, execute_once=1 '\001') at auto_explain.c:267
#18 0x000000000072a15b in ProcessQuery (plan=<value optimized out>,
sourceText=0x1c21458 "INSERT INTO bi_dm.tdm_wh_shopgds_fnsh_rt (STATIS_DATE,SITE_CD,LGORT,ZSIZE,ZVTWEG,VSBED,TOTAL_CNT,FNSH_CNT,UNFNSH_CNT,ETL_TIME,DEPT_CD,TMALL_FLG,BUSS_TP,ZCKYWLX) VALUES($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$"..., params=0x1c21580, queryEnv=0x0, dest=<value optimized out>, completionTag=0x7fffc71c5de0 "") at pquery.c:161
#19 0x000000000072a395 in PortalRunMulti (portal=0x1c57f18, isTopLevel=1 '\001', setHoldSnapshot=0 '\000', dest=0xc9b480, altdest=0xc9b480, completionTag=0x7fffc71c5de0 "") at pquery.c:1286
#20 0x000000000072aa98 in PortalRun (portal=0x1c57f18, count=1, isTopLevel=1 '\001', run_once=1 '\001', dest=0x1c25768, altdest=0x1c25768, completionTag=0x7fffc71c5de0 "") at pquery.c:799
#21 0x0000000000728c9a in exec_execute_message (argc=<value optimized out>, argv=<value optimized out>, dbname=0x1bbb800 "lbiwhdb", username=<value optimized out>) at postgres.c:2007
#22 PostgresMain (argc=<value optimized out>, argv=<value optimized out>, dbname=0x1bbb800 "lbiwhdb", username=<value optimized out>) at postgres.c:4180
#23 0x00000000006bb43a in BackendRun (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:4405
---Type <return> to continue, or q <return> to quit---
#24 BackendStartup (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:4077
#25 ServerLoop (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1755
#26 PostmasterMain (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1363
#27 0x000000000063b4d0 in main (argc=3, argv=0x1b839e0) at main.c:228
(gdb) f 2
#2 0x0000000000718225 in LWLockAcquire (lock=0x7f6c517dbfa4, mode=LW_SHARED) at lwlock.c:1233
1233lwlock.c: No such file or directory.
in lwlock.c
(gdb) p num_held_lwlocks
$1 = 0
(gdb)

## stack of autovacuum:Acquire lock 0x7f6c519ba5a4 and hold 0x7f6c517dbfa4, 0x7f6c51684f64
--------------------------------------
(gdb) bt
#0 0x000000369ea0da00 in sem_wait () from /lib64/libpthread.so.0
#1 0x00000000006a7910 in PGSemaphoreLock (sema=0x7f6c4f77fdb8) at pg_sema.c:316
#2 0x0000000000718225 in LWLockAcquire (lock=0x7f6c519ba5a4, mode=LW_EXCLUSIVE) at lwlock.c:1233
#3 0x00000000004900fb in ginDeletePage (gvs=0x7fffc71c26d0, blkno=3709, isRoot=0 '\000', parent=<value optimized out>, myoff=2) at ginvacuum.c:154
#4 ginScanToDelete (gvs=0x7fffc71c26d0, blkno=3709, isRoot=0 '\000', parent=<value optimized out>, myoff=2) at ginvacuum.c:297
#5 0x000000000049004c in ginScanToDelete (gvs=0x7fffc71c26d0, blkno=6418, isRoot=1 '\001', parent=<value optimized out>, myoff=0) at ginvacuum.c:281
#6 0x0000000000490d1f in ginVacuumPostingTree (info=0x7fffc71c4da0, stats=<value optimized out>, callback=<value optimized out>, callback_state=<value optimized out>) at ginvacuum.c:408
#7 ginbulkdelete (info=0x7fffc71c4da0, stats=<value optimized out>, callback=<value optimized out>, callback_state=<value optimized out>) at ginvacuum.c:643
#8 0x00000000005e9d05 in lazy_vacuum_index (indrel=0x7f6c4f396fe8, stats=0x1c25ad8, vacrelstats=0x1c25578) at vacuumlazy.c:1621
#9 0x00000000005eaa29 in lazy_scan_heap (onerel=<value optimized out>, options=<value optimized out>, params=0x1ca7550, bstrategy=<value optimized out>) at vacuumlazy.c:1311
#10 lazy_vacuum_rel (onerel=<value optimized out>, options=<value optimized out>, params=0x1ca7550, bstrategy=<value optimized out>) at vacuumlazy.c:258
#11 0x00000000005e8d68 in vacuum_rel (relid=176178, relation=<value optimized out>, options=97, params=0x1ca7550) at vacuum.c:1445
#12 0x00000000005e9187 in vacuum (options=97, relation=0x7fffc71c5640, relid=<value optimized out>, params=0x1ca7550, va_cols=0x0, bstrategy=<value optimized out>, isTopLevel=1 '\001')
at vacuum.c:306
#13 0x00000000006ab262 in autovacuum_do_vac_analyze () at autovacuum.c:3135
#14 do_autovacuum () at autovacuum.c:2490
#15 0x00000000006aba70 in AutoVacWorkerMain (argc=<value optimized out>, argv=<value optimized out>) at autovacuum.c:1707
#16 0x00000000006abb46 in StartAutoVacWorker () at autovacuum.c:1504
#17 0x00000000006b972a in StartAutovacuumWorker (postgres_signal_arg=<value optimized out>) at postmaster.c:5462
#18 sigusr1_handler (postgres_signal_arg=<value optimized out>) at postmaster.c:5159
#19 <signal handler called>
#20 0x000000369e2e1393 in __select_nocancel () from /lib64/libc.so.6
#21 0x00000000006baa66 in ServerLoop (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1719
#22 PostmasterMain (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1363
#23 0x000000000063b4d0 in main (argc=3, argv=0x1b839e0) at main.c:228
(gdb) f 2
#2 0x0000000000718225 in LWLockAcquire (lock=0x7f6c519ba5a4, mode=LW_EXCLUSIVE) at lwlock.c:1233
1233lwlock.c: No such file or directory.
in lwlock.c
(gdb) p num_held_lwlocks
$1 = 2
(gdb) p InterruptHoldoffCount
$2 = 3
(gdb) p held_lwlocks
$3 = {{lock = 0x7f6c517dbfa4, mode = LW_EXCLUSIVE}, {lock = 0x7f6c51684f64, mode = LW_EXCLUSIVE}, {lock = 0x7f6c4f782a00, mode = LW_SHARED}, {lock = 0x7f6c4f78df80, mode = LW_EXCLUSIVE}, {
lock = 0x7f6c51a7a1e4, mode = LW_EXCLUSIVE}, {lock = 0x7f6c4f78df80, mode = LW_EXCLUSIVE}, {lock = 0x7f6c53a7eb64, mode = LW_EXCLUSIVE}, {lock = 0x7f6c4f78df80, mode = LW_EXCLUSIVE}, {
lock = 0x0, mode = LW_EXCLUSIVE} <repeats 192 times>}
(gdb)
(gdb) f 4
#4 ginScanToDelete (gvs=0x7fffc71c26d0, blkno=3709, isRoot=0 '\000', parent=<value optimized out>, myoff=2) at ginvacuum.c:297
297in ginvacuum.c
(gdb) p *me
$15 = {child = 0x0, parent = 0x7fffc71c26a0, blkno = 0, leftBlkno = 5649, isRoot = 0 '\000'}
(gdb) p *(&BufferDescriptors[(buffer - 1)].bufferdesc)
$3 = {tag = {rnode = {spcNode = 1663, dbNode = 16397, relNode = 692113}, forkNum = MAIN_FORKNUM, blockNum = 3709}, buf_id = 202271, state = {value = 3549691907}, wait_backend_pid = 0,
freeNext = -2, content_lock = {tranche = 54, state = {value = 1627389952}, waiters = {head = 539, tail = 1223}}}
(gdb) p &((&BufferDescriptors[(buffer - 1)].bufferdesc)->content_lock)
$5 = (LWLock *) 0x7f6c51684f64

(gdb) f 5
#5 0x000000000049004c in ginScanToDelete (gvs=0x7fffc71c26d0, blkno=6418, isRoot=1 '\001', parent=<value optimized out>, myoff=0) at ginvacuum.c:281
281in ginvacuum.c
(gdb) p *(&BufferDescriptors[(buffer - 1)].bufferdesc)
$6 = {tag = {rnode = {spcNode = 1663, dbNode = 16397, relNode = 692113}, forkNum = MAIN_FORKNUM, blockNum = 6418}, buf_id = 224224, state = {value = 3549692473}, wait_backend_pid = 14261,
freeNext = -2, content_lock = {tranche = 54, state = {value = 1627389952}, waiters = {head = 527, tail = 370}}}
(gdb) p &((&BufferDescriptors[(buffer - 1)].bufferdesc)->content_lock)
$7 = (LWLock *) 0x7f6c517dbfa4

According to above information in core file and source code, autovacuum process is trying to lock the block 5649,and be blocked.

6418(1. held LWLock:0x7f6c517dbfa4) root
|
5649(3. Acquire LWLock:0x7f6c519ba5a4) --> 3709(2.held LWLock:0x7f6c51684f64)

Does the locking order of autovacuum process(root->right->left) correct? While insert process lock gin buffer by order of bottom->top and left->right.

1. vacuum(root->right->left):
---------------------------------------------------------------------
static void
ginVacuumPostingTree(GinVacuumState *gvs, BlockNumber rootBlkno)
{
if (ginVacuumPostingTreeLeaves(gvs, rootBlkno))
{
...
LockBufferForCleanup(buffer);//1. (1*)lock the root
...
ginScanToDelete(gvs, rootBlkno, true, &root, InvalidOffsetNumber);//2. (2)scan to delete for root
}
}

static bool
ginScanToDelete(GinVacuumState *gvs, BlockNumber blkno, bool isRoot,
DataPageDeleteStack *parent, OffsetNumber myoff)
{
...
if (!isRoot)
LockBuffer(buffer, GIN_EXCLUSIVE);//2.1.1 (4)lock the first child(left);
//2.2.1 (7*)lock the second child(right);
...

if (!GinPageIsLeaf(page))
{
...
for (i = FirstOffsetNumber; i <= GinPageGetOpaque(page)->maxoff; i++)
{
PostingItem *pitem = GinDataPageGetPostingItem(page, i);

if (ginScanToDelete(gvs, PostingItemGetBlockNumber(pitem), FALSE, me, i))// 2.1 (3)scan to delete for the first child(left);
// 2.2 (6)scan to delete for the second child(right);
i--;
}
}
...
if (isempty)
{
/* we never delete the left- or rightmost branch */
if (me->leftBlkno != InvalidBlockNumber && !GinPageRightMost(page))
{
Assert(!isRoot);
ginDeletePage(gvs, blkno, me->leftBlkno, me->parent->blkno, myoff, me->parent->isRoot);//2.2.2 (8)delete the second child(right)
meDelete = TRUE;
}
}

if (!isRoot)
LockBuffer(buffer, GIN_UNLOCK);//2.1.2 (5)unlock the first child(left)
...
}

static void
ginDeletePage(GinVacuumState *gvs, BlockNumber deleteBlkno, BlockNumber leftBlkno,
BlockNumber parentBlkno, OffsetNumber myoff, bool isParentRoot)
{
...
lBuffer = ReadBufferExtended(gvs->index, MAIN_FORKNUM, leftBlkno,
RBM_NORMAL, gvs->strategy);
...
LockBuffer(lBuffer, GIN_EXCLUSIVE);// 2.2.2.1 (9*)lock the first child(left);
...
}
---------------------------------------------------------------------

2. insert(bottom->top,left->right):

https://www.postgresql.org/message-id/19e4290b.3bb.166ea2c6831.Coremail.chjischj%40163.com
> # insert process(ginInsertValue())
>
>
>
> 644(root blkno)
> |
> 7054(2. held LWLock:0x2aaac587ae64) ----rightlink----> xxxx(3. Acquire LWLock:0x2aaab4009564,buffer = 2119038,blkno should be 9954)
> |
> 701(1. held LWLock:0x2aaab670dfe4)
>
>
> The ginInsertValue() function above gets the lwlock in the order described in the README.
>
>
> src/backend/access/gin/README
> ---------------------------------------------------------------
> To avoid deadlocks, B-tree pages must always be locked in the same order:
> left to right, and bottom to top.
> ...
> -----------------------------------------------------------------

Regards,
Chen Huajun

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Gierth 2019-09-29 10:46:49 Re: Possible bug: SQL function parameter in window frame definition
Previous Message chenhj 2019-09-29 07:20:54 Re:Re: could not access status of transaction