Re: REL_11_STABLE: dsm.c - cannot unpin a segment that is not pinned

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: REL_11_STABLE: dsm.c - cannot unpin a segment that is not pinned
Date: 2019-02-17 20:07:00
Message-ID: 20190217200700.GB28750@telsasoft.com
Lists: pgsql-hackers

On Sun, Feb 17, 2019 at 01:41:45PM -0600, Justin Pryzby wrote:
> On Sat, Feb 16, 2019 at 09:16:01PM +1300, Thomas Munro wrote:
> > On Sat, Feb 16, 2019 at 5:31 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > > Thanks, will leave it spinning overnight.
>
> No errors in ~36 hours (126 CPU-hrs), so that seems to work. Thanks.

Actually...

After this stress test completed, I killed the postmaster, and one of the
backends was left running and didn't die on its own. It did die gracefully
when I killed the backend or the client.

I was able to reproduce this on the first try, but it took numerous attempts
to reproduce it a second and third time in order to save pg_stat_activity.

Is there some issue regarding dsm_postmaster_shutdown?

[pryzbyj@database postgresql]$ ps -wwf --ppid 31656
UID        PID  PPID  C STIME TTY          TIME CMD
pryzbyj   4512 31656  1 13:00 ?        00:00:00 postgres: pryzbyj postgres [local] EXPLAIN
pryzbyj  31657 31656  0 12:59 ?        00:00:00 postgres: logger
pryzbyj  31659 31656  0 12:59 ?        00:00:00 postgres: checkpointer
pryzbyj  31662 31656  0 12:59 ?        00:00:00 postgres: stats collector
pryzbyj  31785 31656  0 12:59 ?        00:00:00 postgres: pryzbyj postgres [local] idle

datid | 13285
datname | postgres
pid | 4512
usesysid | 10
usename | pryzbyj
application_name | psql
client_addr |
client_hostname |
client_port | -1
backend_start | 2019-02-17 13:00:50.79285-07
xact_start | 2019-02-17 13:00:50.797711-07
query_start | 2019-02-17 13:00:50.797711-07
state_change | 2019-02-17 13:00:50.797713-07
wait_event_type | IPC
wait_event | ExecuteGather
state | active
backend_xid |
backend_xmin | 1569
query | explain analyze SELECT colcld.child c, parent p, array_agg(colpar.attname::text ORDER BY colpar.attnum) cols, array_agg(format_type(colpar.atttypid, colpar.atttypmod) ORDER BY colpar.attnum) AS types FROM queued_alters qa JOIN pg_attribute colpar ON to_regclass(qa.parent)=colpar.attrelid AND colpar.attnum>0 AND NOT colpar.attisdropped JOIN (SELECT *, attrelid::regclass::text AS child FROM pg_attribute) colcld ON to_regclass(qa.child) =colcld.attrelid AND colcld.attnum>0 AND NOT colcld.attisdropped WHERE colcld.attname=colpar.attname AND colpar.atttypid!=colcld.atttypid GROUP BY 1,2 ORDER BY parent LIKE 'unused%', regexp_replace(colcld.child, '.*_((([0-9]{4}_[0-9]{2})_[0-9]{2})|(([0-9]{6})([0-9]{2})?))$', '\3\5') DESC, regexp_replace(colcld.child, '.*_', '') DESC LIMIT 1
backend_type | client backend

#0 0x00007fe131637163 in __epoll_wait_nocancel () from /lib64/libc.so.6
#1 0x0000000000758d26 in WaitEventSetWaitBlock (nevents=1, occurred_events=0x7ffde16775a0, cur_timeout=-1, set=0x7fe132640e50) at latch.c:1048
#2 WaitEventSetWait (set=set@entry=0x7fe132640e50, timeout=timeout@entry=-1, occurred_events=occurred_events@entry=0x7ffde16775a0, nevents=nevents@entry=1, wait_event_info=wait_event_info@entry=134217731)
at latch.c:1000
#3 0x00000000007591c2 in WaitLatchOrSocket (latch=0x7fe12a7591b4, wakeEvents=wakeEvents@entry=1, sock=sock@entry=-1, timeout=-1, timeout@entry=0, wait_event_info=wait_event_info@entry=134217731) at latch.c:385
#4 0x00000000007592a0 in WaitLatch (latch=<optimized out>, wakeEvents=wakeEvents@entry=1, timeout=timeout@entry=0, wait_event_info=wait_event_info@entry=134217731) at latch.c:339
#5 0x00000000006401e2 in gather_readnext (gatherstate=<optimized out>) at nodeGather.c:367
#6 gather_getnext (gatherstate=0x2af1f70) at nodeGather.c:256
#7 ExecGather (pstate=0x2af1f70) at nodeGather.c:207
#8 0x0000000000630188 in ExecProcNodeInstr (node=0x2af1f70) at execProcnode.c:461
#9 0x0000000000653506 in ExecProcNode (node=0x2af1f70) at ../../../src/include/executor/executor.h:247
#10 ExecSort (pstate=0x2af1e58) at nodeSort.c:107
#11 0x0000000000630188 in ExecProcNodeInstr (node=0x2af1e58) at execProcnode.c:461
#12 0x0000000000638a89 in ExecProcNode (node=0x2af1e58) at ../../../src/include/executor/executor.h:247
#13 fetch_input_tuple (aggstate=aggstate@entry=0x2af19e0) at nodeAgg.c:406
#14 0x000000000063a6b0 in agg_retrieve_direct (aggstate=0x2af19e0) at nodeAgg.c:1740
#15 ExecAgg (pstate=0x2af19e0) at nodeAgg.c:1555
#16 0x0000000000630188 in ExecProcNodeInstr (node=0x2af19e0) at execProcnode.c:461
#17 0x0000000000653506 in ExecProcNode (node=0x2af19e0) at ../../../src/include/executor/executor.h:247
#18 ExecSort (pstate=0x2af18c8) at nodeSort.c:107
#19 0x0000000000630188 in ExecProcNodeInstr (node=0x2af18c8) at execProcnode.c:461
#20 0x00000000006498e1 in ExecProcNode (node=0x2af18c8) at ../../../src/include/executor/executor.h:247
#21 ExecLimit (pstate=0x2af16b8) at nodeLimit.c:95
#22 0x0000000000630188 in ExecProcNodeInstr (node=0x2af16b8) at execProcnode.c:461
#23 0x0000000000628eda in ExecProcNode (node=0x2af16b8) at ../../../src/include/executor/executor.h:247
#24 ExecutePlan (execute_once=<optimized out>, dest=0xd96e60 <donothingDR>, direction=<optimized out>, numberTuples=0, sendTuples=true, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x2af16b8,
estate=0x2af1448) at execMain.c:1723
#25 standard_ExecutorRun (queryDesc=0x2b1eda0, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:364
#26 0x00000000005c635f in ExplainOnePlan (plannedstmt=plannedstmt@entry=0x2b21c30, into=into@entry=0x0, es=es@entry=0x2ad1580,
queryString=queryString@entry=0x29e7348 "explain analyze SELECT colcld.child c, parent p, array_agg(colpar.attname::text ORDER BY colpar.attnum) cols, array_agg(format_type(colpar.atttypid, colpar.atttypmod) ORDER BY colpar.attnum) AS types "..., params=params@entry=0x0, queryEnv=queryEnv@entry=0x0, planduration=planduration@entry=0x7ffde1677970) at explain.c:535
#27 0x00000000005c665f in ExplainOneQuery (query=<optimized out>, cursorOptions=<optimized out>, into=0x0, es=0x2ad1580,
queryString=0x29e7348 "explain analyze SELECT colcld.child c, parent p, array_agg(colpar.attname::text ORDER BY colpar.attnum) cols, array_agg(format_type(colpar.atttypid, colpar.atttypmod) ORDER BY colpar.attnum) AS types "..., params=0x0, queryEnv=0x0) at explain.c:371
#28 0x00000000005c6bbe in ExplainQuery (pstate=pstate@entry=0x2a09bd8, stmt=stmt@entry=0x2aa0bb8,
queryString=queryString@entry=0x29e7348 "explain analyze SELECT colcld.child c, parent p, array_agg(colpar.attname::text ORDER BY colpar.attnum) cols, array_agg(format_type(colpar.atttypid, colpar.atttypmod) ORDER BY colpar.attnum) AS types "..., params=params@entry=0x0, queryEnv=queryEnv@entry=0x0, dest=dest@entry=0x2a09b40) at explain.c:254
#29 0x0000000000782a1d in standard_ProcessUtility (pstmt=0x2aa0d40,
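
For anyone cross-checking the backtrace against the pg_stat_activity row: the
wait_event_info=134217731 value in frames #2 to #4 can be decoded by hand.
This is only a sketch, assuming the encoding in PostgreSQL's pgstat.h (wait
class in the top byte, event ID in the low 16 bits, PG_WAIT_IPC = 0x08000000).
The helper names below are hypothetical and not part of PostgreSQL.

```c
#include <stdint.h>

/* Assumed masks, mirroring how PostgreSQL splits wait_event_info. */
#define WAIT_CLASS_MASK    0xFF000000U
#define WAIT_EVENT_ID_MASK 0x0000FFFFU

/* Assumed class value for IPC waits (PG_WAIT_IPC in pgstat.h). */
#define PG_WAIT_IPC 0x08000000U

/* Hypothetical helper: extract the wait class from wait_event_info. */
static uint32_t
wait_class(uint32_t info)
{
    return info & WAIT_CLASS_MASK;
}

/* Hypothetical helper: extract the per-class event ID. */
static uint32_t
wait_event_id(uint32_t info)
{
    return info & WAIT_EVENT_ID_MASK;
}

/* Hypothetical helper: true if the value describes an IPC-class wait. */
static int
is_ipc_wait(uint32_t info)
{
    return wait_class(info) == PG_WAIT_IPC;
}
```

134217731 is 0x08000003, so wait_class() yields 0x08000000 and is_ipc_wait()
is true, which agrees with wait_event_type = IPC above. The event ID is 3;
mapping that number to a name depends on the enum order in the source tree
being debugged, so I have not hard-coded it here.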
