REL_11_STABLE: dsm.c - cannot unpin a segment that is not pinned

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Subject: REL_11_STABLE: dsm.c - cannot unpin a segment that is not pinned
Date: 2019-02-16 02:38:54
Message-ID: 20190216023854.GF30291@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I saw this error once last week while stress testing to reproduce earlier bugs,
but tentatively thought it was a downstream symptom of those bugs (since
fixed), and now wanted to check that #15585 and others were no longer
reproducible. Unfortunately I got this error while running same test case [2]
as for previous bug ('could not attach').

2019-02-14 23:40:41.611 MST [32287] ERROR: cannot unpin a segment that is not pinned

On commit faf132449c0cafd31fe9f14bbf29ca0318a89058 (REL_11_STABLE including
both of last week's post-11.2 DSA patches), I reproduced twice, once within
~2.5 hours, once within 30min.

I'm not able to reproduce on master running overnight and now 16+hours.

See also:

pg11.1: dsa_area could not attach to segment - resolved by commit 6c0fb941892519ad6b8873e99c4001404fb9a128
[1] https://www.postgresql.org/message-id/20181231221734.GB25379%40telsasoft.com
pg11.1: dsa_area could not attach to segment:
[2] https://www.postgresql.org/message-id/20190211040132.GV31721%40telsasoft.com
BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query
[3] https://www.postgresql.org/message-id/flat/15585-324ff6a93a18da46%40postgresql.org
dsa_allocate() faliure - resolved by commit 7215efdc005e694ec93678a6203dbfc714d12809 (also doesn't affect master)
[4] https://www.postgresql.org/message-id/flat/CAMAYy4%2Bw3NTBM5JLWFi8twhWK4%3Dk_5L4nV5%2BbYDSPu8r4b97Zg%40mail.gmail.com

#0 0x00007f96a589e277 in raise () from /lib64/libc.so.6
#1 0x00007f96a589f968 in abort () from /lib64/libc.so.6
#2 0x000000000088b6f7 in errfinish (dummy=dummy(at)entry=0) at elog.c:555
#3 0x000000000088eee8 in elog_finish (elevel=elevel(at)entry=22, fmt=fmt(at)entry=0xa52cb0 "cannot unpin a segment that is not pinned") at elog.c:1376
#4 0x00000000007578b2 in dsm_unpin_segment (handle=1780672242) at dsm.c:914
#5 0x00000000008aee15 in dsa_release_in_place (place=0x7f96a6991640) at dsa.c:618
#6 0x00000000007571a9 in dsm_detach (seg=0x2470f78) at dsm.c:731
#7 0x0000000000509233 in DestroyParallelContext (pcxt=0x24c18c0) at parallel.c:900
#8 0x000000000062db65 in ExecParallelCleanup (pei=0x25aacf8) at execParallel.c:1154
#9 0x0000000000640588 in ExecShutdownGather (node=node(at)entry=0x2549b60) at nodeGather.c:406
#10 0x0000000000630208 in ExecShutdownNode (node=0x2549b60) at execProcnode.c:767
#11 0x000000000067724f in planstate_tree_walker (planstate=planstate(at)entry=0x2549a48, walker=walker(at)entry=0x6301c0 <ExecShutdownNode>, context=context(at)entry=0x0) at nodeFuncs.c:3739
#12 0x00000000006301dd in ExecShutdownNode (node=0x2549a48) at execProcnode.c:749
#13 0x000000000067724f in planstate_tree_walker (planstate=planstate(at)entry=0x25495d0, walker=walker(at)entry=0x6301c0 <ExecShutdownNode>, context=context(at)entry=0x0) at nodeFuncs.c:3739
#14 0x00000000006301dd in ExecShutdownNode (node=0x25495d0) at execProcnode.c:749
#15 0x000000000067724f in planstate_tree_walker (planstate=planstate(at)entry=0x25494b8, walker=walker(at)entry=0x6301c0 <ExecShutdownNode>, context=context(at)entry=0x0) at nodeFuncs.c:3739
#16 0x00000000006301dd in ExecShutdownNode (node=0x25494b8) at execProcnode.c:749
#17 0x000000000067724f in planstate_tree_walker (planstate=planstate(at)entry=0x25492a8, walker=walker(at)entry=0x6301c0 <ExecShutdownNode>, context=context(at)entry=0x0) at nodeFuncs.c:3739
#18 0x00000000006301dd in ExecShutdownNode (node=node(at)entry=0x25492a8) at execProcnode.c:749
#19 0x0000000000628fbd in ExecutePlan (execute_once=<optimized out>, dest=0xd96e60 <donothingDR>, direction=<optimized out>, numberTuples=0, sendTuples=true, operation=CMD_SELECT, use_parallel_mode=<optimized out>,
planstate=0x25492a8, estate=0x2549038) at execMain.c:1787
#20 standard_ExecutorRun (queryDesc=0x2576990, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:364
#21 0x00000000005c635f in ExplainOnePlan (plannedstmt=plannedstmt(at)entry=0x2579820, into=into(at)entry=0x0, es=es(at)entry=0x2529170,
queryString=queryString(at)entry=0x243f348 "explain analyze SELECT colcld.child c, parent p, array_agg(colpar.attname::text ORDER BY colpar.attnum) cols, array_agg(format_type(colpar.atttypid, colpar.atttypmod) ORDER BY colpar.attnum) AS types "..., params=params(at)entry=0x0, queryEnv=queryEnv(at)entry=0x0, planduration=planduration(at)entry=0x7fff1048afa0) at explain.c:535
#22 0x00000000005c665f in ExplainOneQuery (query=<optimized out>, cursorOptions=<optimized out>, into=0x0, es=0x2529170,
queryString=0x243f348 "explain analyze SELECT colcld.child c, parent p, array_agg(colpar.attname::text ORDER BY colpar.attnum) cols, array_agg(format_type(colpar.atttypid, colpar.atttypmod) ORDER BY colpar.attnum) AS types "..., params=0x0, queryEnv=0x0) at explain.c:371
#23 0x00000000005c6bbe in ExplainQuery (pstate=pstate(at)entry=0x2461bd8, stmt=stmt(at)entry=0x24f87a8,
queryString=queryString(at)entry=0x243f348 "explain analyze SELECT colcld.child c, parent p, array_agg(colpar.attname::text ORDER BY colpar.attnum) cols, array_agg(format_type(colpar.atttypid, colpar.atttypmod) ORDER BY colpar.attnum) AS types "..., params=params(at)entry=0x0, queryEnv=queryEnv(at)entry=0x0, dest=dest(at)entry=0x2461b40) at explain.c:254
#24 0x0000000000782a5d in standard_ProcessUtility (pstmt=0x24f8930,
queryString=0x243f348 "explain analyze SELECT colcld.child c, parent p, array_agg(colpar.attname::text ORDER BY colpar.attnum) cols, array_agg(format_type(colpar.atttypid, colpar.atttypmod) ORDER BY colpar.attnum) AS types "..., context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x2461b40, completionTag=0x7fff1048b160 "") at utility.c:675
#25 0x000000000077fcc6 in PortalRunUtility (portal=0x24a4bd8, pstmt=0x24f8930, isTopLevel=<optimized out>, setHoldSnapshot=<optimized out>, dest=0x2461b40, completionTag=0x7fff1048b160 "") at pquery.c:1178
#26 0x0000000000780b03 in FillPortalStore (portal=portal(at)entry=0x24a4bd8, isTopLevel=isTopLevel(at)entry=true) at pquery.c:1038
#27 0x0000000000781638 in PortalRun (portal=portal(at)entry=0x24a4bd8, count=count(at)entry=9223372036854775807, isTopLevel=isTopLevel(at)entry=true, run_once=run_once(at)entry=true, dest=dest(at)entry=0x2512490,
#28 0x000000000077d27f in exec_simple_query (
query_string=0x243f348 "explain analyze SELECT colcld.child c, parent p, array_agg(colpar.attname::text ORDER BY colpar.attnum) cols, array_agg(format_type(colpar.atttypid, colpar.atttypmod) ORDER BY colpar.attnum) AS types "...) at postgres.c:1145
#29 0x000000000077e5b2 in PostgresMain (argc=<optimized out>, argv=argv(at)entry=0x24691b8, dbname=0x24690b0 "postgres", username=<optimized out>) at postgres.c:4182
#30 0x000000000047cdf3 in BackendRun (port=0x24613a0) at postmaster.c:4361
#31 BackendStartup (port=0x24613a0) at postmaster.c:4033
#32 ServerLoop () at postmaster.c:1706
#33 0x0000000000706449 in PostmasterMain (argc=argc(at)entry=20, argv=argv(at)entry=0x2439ba0) at postmaster.c:1379
#34 0x000000000047ddd1 in main (argc=20, argv=0x2439ba0) at main.c:228

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-02-16 02:45:02 Re: Refactoring the checkpointer's fsync request queue
Previous Message Amit Kapila 2019-02-16 02:36:18 Re: 2019-03 CF Summary / Review - Tranche #1