Re: [sqlsmith] Parallel worker executor crash on master

From: Andreas Seltenreich <seltenreich(at)gmx(dot)de>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [sqlsmith] Parallel worker executor crash on master
Date: 2017-12-16 23:26:45
Message-ID: 87d13etft6.fsf@ansel.ydns.eu
Lists: pgsql-hackers

Thomas Munro writes:

> On Sat, Dec 16, 2017 at 10:13 PM, Andreas Seltenreich
> <seltenreich(at)gmx(dot)de> wrote:
>> Core was generated by `postgres: smith regression [local] SELECT '.
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0 gather_getnext (gatherstate=0x555a5fff1350) at nodeGather.c:283
>> 283 estate->es_query_dsa = gatherstate->pei->area;
>> #1 ExecGather (pstate=0x555a5fff1350) at nodeGather.c:216
>
> Hmm, thanks. That's not good. Do we know if gatherstate->pei is
> NULL, or if it's somehow pointing to garbage?

It was NULL in all the core dumps I looked at. A full dump of
gatherstate is in footnote [1] below.

> Not sure how either of those things could happen, since we only set it
> to NULL in ExecShutdownGather() after which point we shouldn't call
> ExecGather() again, and any MemoryContext problems with pei should
> have caused problems already without this patch (for example in
> ExecParallelCleanup). Clearly I'm missing something.

FWIW, all backtraces collected so far are identical for the first nine
frames. After ExecProjectSet, they are pretty random executor innards.

,----
| #1 ExecGather at nodeGather.c:216
| #2 0x0000555bc9fb41ea in ExecProcNode at ../../../src/include/executor/executor.h:242
| #3 ExecutePlan at execMain.c:1718
| #4 standard_ExecutorRun at execMain.c:361
| #5 0x0000555bc9fc07cc in postquel_getnext at functions.c:865
| #6 fmgr_sql (fcinfo=0x555bcba07748) at functions.c:1161
| #7 0x0000555bc9fbc4f7 in ExecMakeFunctionResultSet at execSRF.c:604
| #8 0x0000555bc9fd7cbb in ExecProjectSRF at nodeProjectSet.c:175
| #9 0x0000560828dc8df5 in ExecProjectSet at nodeProjectSet.c:105
`----

regards,
Andreas

Footnotes:
[1]
(gdb) p *gatherstate
$3 = {
  ps = {
    type = T_GatherState,
    plan = 0x555bcb9faf30,
    state = 0x555bcba3d098,
    ExecProcNode = 0x555bc9fc9e30 <ExecGather>,
    ExecProcNodeReal = 0x555bc9fc9e30 <ExecGather>,
    instrument = 0x0,
    worker_instrument = 0x0,
    qual = 0x0,
    lefttree = 0x555bcba3d678,
    righttree = 0x0,
    initPlan = 0x0,
    subPlan = 0x0,
    chgParam = 0x0,
    ps_ResultTupleSlot = 0x555bcba3d5b8,
    ps_ExprContext = 0x555bcba3d3c8,
    ps_ProjInfo = 0x0
  },
  initialized = 1 '\001',
  need_to_scan_locally = 1 '\001',
  tuples_needed = -1,
  funnel_slot = 0x555bcba3d4c0,
  pei = 0x0,
  nworkers_launched = 0,
  nreaders = 0,
  nextreader = 0,
  reader = 0x0
}
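
For readers following the dump: pei is 0x0 and nworkers_launched is 0,
while the statement at nodeGather.c:283 reads gatherstate->pei->area
unconditionally. The stand-alone sketch below only models that pattern.
The struct definitions are simplified stand-ins rather than the real
PostgreSQL ones, set_query_dsa() is a hypothetical helper name, and the
NULL guard is just one conceivable way to avoid the dereference, not
necessarily the fix the thread settles on.

```c
#include <stddef.h>

/* Simplified stand-ins; NOT the real PostgreSQL definitions. */
typedef struct dsa_area { int dummy; } dsa_area;
typedef struct ParallelExecutorInfo { dsa_area *area; } ParallelExecutorInfo;
typedef struct EState { dsa_area *es_query_dsa; } EState;
typedef struct GatherState
{
    ParallelExecutorInfo *pei;      /* NULL in the core dumps */
    int         nworkers_launched;  /* 0 in the core dumps */
} GatherState;

/*
 * The crashing statement has this shape:
 *     estate->es_query_dsa = gatherstate->pei->area;
 * With pei == NULL that is a NULL pointer dereference (SIGSEGV).
 *
 * Hypothetical guarded variant: fall back to NULL when no parallel
 * executor info is attached to the Gather node.
 */
void
set_query_dsa(EState *estate, const GatherState *gatherstate)
{
    estate->es_query_dsa = gatherstate->pei ? gatherstate->pei->area : NULL;
}
```

Called with a GatherState whose pei is NULL, the helper leaves
es_query_dsa set to NULL instead of faulting. With a non-NULL pei it
behaves like the original assignment.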
