From: | Andreas Seltenreich <seltenreich(at)gmx(dot)de> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | [sqlsmith] Crash in gather_readnext |
Date: | 2016-12-05 20:07:58 |
Message-ID: | 87k2be16n5.fsf@credativ.de |
Lists: | pgsql-hackers |
Hi,
on master as of a0ae54d, there's a 1 in 10e6 chance sqlsmith catches
gather_readnext reading beyond the gatherstate->readers array with
readers[gatherstate->readnext]. Sample backtrace below.
As readnext is never explicitly initialized, I think what happens is
that a rescan gets fewer workers than the initial scan, and the stale
readnext then points outside the array. I'm no longer seeing these
crashes when explicitly initializing readnext to 0 like in the attached
patch.
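To illustrate the suspected failure mode, here is a minimal standalone sketch, not the PostgreSQL code and not the attached patch. ToyGatherState and the toy_* functions are hypothetical stand-ins; only the name readnext and the idea of a per-scan reader array come from the description above. It models a round-robin cursor that survives a relaunch with a smaller reader array unless it is reset to 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two pieces of state involved. */
typedef struct ToyGatherState
{
	int			nreaders;	/* number of valid entries in the reader array */
	int			readnext;	/* round-robin cursor into that array */
} ToyGatherState;

/*
 * Model a (re)launch of workers.  With reset_cursor == false the cursor
 * keeps whatever value the previous scan left behind, which is the
 * behaviour suspected above; with true it is reset to 0.
 */
void
toy_launch(ToyGatherState *gs, int nworkers_launched, bool reset_cursor)
{
	gs->nreaders = nworkers_launched;
	if (reset_cursor)
		gs->readnext = 0;
}

/* Advance the cursor round-robin over the current readers. */
void
toy_advance(ToyGatherState *gs)
{
	assert(gs->nreaders > 0);
	gs->readnext = (gs->readnext + 1) % gs->nreaders;
}

/* True if readers[readnext] would be a valid array access. */
bool
toy_cursor_in_bounds(const ToyGatherState *gs)
{
	return gs->readnext >= 0 && gs->readnext < gs->nreaders;
}
```

In this toy model, launching with 4 readers, advancing three times, and then relaunching with 2 readers without a reset leaves readnext at 3, so toy_cursor_in_bounds() returns false. Relaunching with the reset brings it back in bounds.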
regards,
Andreas
Program terminated with signal SIGSEGV, Segmentation fault.
#0 shm_mq_receive (mqh=0x259, nbytesp=nbytesp@entry=0x7ffc55ce0580, datap=datap@entry=0x7ffc55ce0588, nowait=nowait@entry=1 '\001') at shm_mq.c:520
520 shm_mq *mq = mqh->mqh_queue;
(gdb) bt
#0 shm_mq_receive (mqh=0x259, nbytesp=nbytesp@entry=0x7ffc55ce0580, datap=datap@entry=0x7ffc55ce0588, nowait=nowait@entry=1 '\001') at shm_mq.c:520
#1 0x000000000060b8b7 in TupleQueueReaderNext (reader=reader@entry=0x5446c10, nowait=nowait@entry=1 '\001', done=done@entry=0x7ffc55ce065b "") at tqueue.c:692
#2 0x00000000005f5e03 in gather_readnext (gatherstate=0x52a9918) at nodeGather.c:339
#3 gather_getnext (gatherstate=0x52a9918) at nodeGather.c:292
#4 ExecGather (node=node@entry=0x52a9918) at nodeGather.c:233
#5 0x00000000005e3b68 in ExecProcNode (node=0x52a9918) at execProcnode.c:515
#6 0x00000000005eb2f2 in ExecScanFetch (recheckMtd=0x605e40 <SubqueryRecheck>, accessMtd=0x605e50 <SubqueryNext>, node=0x52a86c0) at execScan.c:95
#7 ExecScan (node=node@entry=0x52a86c0, accessMtd=accessMtd@entry=0x605e50 <SubqueryNext>, recheckMtd=recheckMtd@entry=0x605e40 <SubqueryRecheck>) at execScan.c:180
#8 0x0000000000605e6f in ExecSubqueryScan (node=node@entry=0x52a86c0) at nodeSubqueryscan.c:85
#9 0x00000000005e3c68 in ExecProcNode (node=node@entry=0x52a86c0) at execProcnode.c:445
#10 0x00000000006001d6 in ExecNestLoop (node=node@entry=0x52a7978) at nodeNestloop.c:123
#11 0x00000000005e3bf8 in ExecProcNode (node=node@entry=0x52a7978) at execProcnode.c:476
#12 0x00000000006001d6 in ExecNestLoop (node=node@entry=0x52a5120) at nodeNestloop.c:123
#13 0x00000000005e3bf8 in ExecProcNode (node=node@entry=0x52a5120) at execProcnode.c:476
#14 0x00000000006001d6 in ExecNestLoop (node=node@entry=0x52a3d50) at nodeNestloop.c:123
#15 0x00000000005e3bf8 in ExecProcNode (node=0x52a3d50) at execProcnode.c:476
#16 0x00000000006015e5 in ExecResult (node=node@entry=0x52a3140) at nodeResult.c:130
#17 0x00000000005e3d18 in ExecProcNode (node=node@entry=0x52a3140) at execProcnode.c:392
#18 0x00000000005fb360 in ExecLimit (node=node@entry=0x52a2e70) at nodeLimit.c:91
#19 0x00000000005e3af8 in ExecProcNode (node=node@entry=0x52a2e70) at execProcnode.c:531
#20 0x0000000000600299 in ExecNestLoop (node=node@entry=0x52a1a10) at nodeNestloop.c:174
#21 0x00000000005e3bf8 in ExecProcNode (node=node@entry=0x52a1a10) at execProcnode.c:476
#22 0x00000000006001d6 in ExecNestLoop (node=node@entry=0x52a16d0) at nodeNestloop.c:123
#23 0x00000000005e3bf8 in ExecProcNode (node=node@entry=0x52a16d0) at execProcnode.c:476
#24 0x00000000005dfdae in ExecutePlan (dest=0x50cbb00, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x52a16d0, estate=0x3610968) at execMain.c:1567
#25 standard_ExecutorRun (queryDesc=0x36805b8, direction=<optimized out>, count=0) at execMain.c:338
#26 0x0000000000701a58 in PortalRunSelect (portal=portal@entry=0x529da38, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x50cbb00) at pquery.c:946
#27 0x000000000070300e in PortalRun (portal=portal@entry=0x529da38, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x50cbb00, altdest=altdest@entry=0x50cbb00, completionTag=completionTag@entry=0x7ffc55ce0ed0 "") at pquery.c:787
#28 0x0000000000700869 in exec_simple_query (query_string=0x45d3028 "select ...") at postgres.c:1094
#29 PostgresMain (argc=<optimized out>, argv=argv@entry=0x23ce878, dbname=<optimized out>, username=<optimized out>) at postgres.c:4069
#30 0x000000000046d9d9 in BackendRun (port=0x23d1ad0) at postmaster.c:4271
#31 BackendStartup (port=0x23d1ad0) at postmaster.c:3945
#32 ServerLoop () at postmaster.c:1701
#33 0x0000000000698ed9 in PostmasterMain (argc=argc@entry=4, argv=argv@entry=0x23a05c0) at postmaster.c:1309
#34 0x000000000046ebbd in main (argc=4, argv=0x23a05c0) at main.c:228
Attachment | Content-Type | Size |
---|---|---|
0001-Fix-potential-crash-on-ReScanGather.patch | text/x-diff | 1.2 KB |