segfault due to invalid cached plan

From: Nicolas Thauvin <nicolas(dot)thauvin(at)dalibo(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: segfault due to invalid cached plan
Date: 2018-03-28 13:54:11
Message-ID: 20180328155411.16476389@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

A customer sent us a core dump of the crash of the background worker of
the powa extension, running on 9.6.8 along side with cstore_fdw.

The background worker loops on a plpgsql function, which then execute
another plpgsql function that issues the query "TRUNCATE
powa_statements_history_current".

Here is the backtrace :

#0 0x0000000000603147 in list_copy (oldlist=0x2f106a8) at list.c:1176
#1 0x0000000000603380 in list_difference (list1=list1(at)entry=0x2f106a8, list2=list2(at)entry=0x0) at list.c:867
#2 0x00007f3b1545db1a in CStoreProcessUtility (parseTree=0x2f18108, queryString=0x2f17698 "TRUNCATE powa_statements_history_current", context=PROCESS_UTILITY_QUERY, paramListInfo=0x0, destReceiver=0xc19fe0 <spi_printtupDR>,
completionTag=0x7ffe2f207430 "3\001") at cstore_fdw.c:353
#3 0x00000000005eb84e in _SPI_execute_plan (plan=plan(at)entry=0x2f17ca8, paramLI=paramLI(at)entry=0x0, snapshot=snapshot(at)entry=0x0, crosscheck_snapshot=crosscheck_snapshot(at)entry=0x0, read_only=read_only(at)entry=0 '\000',
fire_triggers=fire_triggers(at)entry=1 '\001', tcount=tcount(at)entry=0) at spi.c:2202
#4 0x00000000005ebf58 in SPI_execute_plan_with_paramlist (plan=0x2f17ca8, params=0x0, read_only=<optimized out>, tcount=0) at spi.c:452
#5 0x00007f3ad716b6e5 in exec_stmt_execsql (estate=estate(at)entry=0x7ffe2f2077f0, stmt=stmt(at)entry=0x2d99ea8) at pl_exec.c:3515
#6 0x00007f3ad716ce57 in exec_stmt (stmt=0x2d99ea8, estate=0x7ffe2f2077f0) at pl_exec.c:1503
#7 exec_stmts (estate=0x7ffe2f2077f0, stmts=<optimized out>) at pl_exec.c:1398
#8 0x00007f3ad716ec8b in exec_stmt_block (estate=estate(at)entry=0x7ffe2f2077f0, block=0x2f091b8) at pl_exec.c:1336
#9 0x00007f3ad716eeab in plpgsql_exec_function (func=func(at)entry=0x2d8b738, fcinfo=fcinfo(at)entry=0x2f25628, simple_eval_estate=simple_eval_estate(at)entry=0x0) at pl_exec.c:434
#10 0x00007f3ad7163db7 in plpgsql_call_handler (fcinfo=0x2f25628) at pl_handler.c:255
#11 0x00000000005c55a2 in ExecMakeFunctionResultNoSets (fcache=0x2f255b8, econtext=0x2f253c8, isNull=0x2f25f40 "", isDone=<optimized out>) at execQual.c:2041
#12 0x00000000005cb0ce in ExecTargetList (tupdesc=<optimized out>, isDone=0x7ffe2f207b34, itemIsDone=0x2f26058, isnull=0x2f25f40 "", values=0x2f25f28, econtext=0x2f253c8, targetlist=0x2f26028) at execQual.c:5423
#13 ExecProject (projInfo=<optimized out>, isDone=isDone(at)entry=0x7ffe2f207b34) at execQual.c:5647
#14 0x00000000005e0162 in ExecResult (node=node(at)entry=0x2f252b8) at nodeResult.c:155
#15 0x00000000005c4678 in ExecProcNode (node=node(at)entry=0x2f252b8) at execProcnode.c:392
#16 0x00000000005c09d7 in ExecutePlan (dest=0xc19fe0 <spi_printtupDR>, direction=<optimized out>, numberTuples=0, sendTuples=1 '\001', operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x2f252b8, estate=0x2f251a8)
at execMain.c:1566
#17 standard_ExecutorRun (queryDesc=0x2f04528, direction=<optimized out>, count=0) at execMain.c:338
#18 0x00007f3b15c7c0a5 in pgss_ExecutorRun (queryDesc=0x2f04528, direction=ForwardScanDirection, count=0) at pg_stat_statements.c:877
#19 0x00000000005ebbdf in _SPI_pquery (tcount=0, fire_triggers=1 '\001', queryDesc=0x2f04528) at spi.c:2418
#20 _SPI_execute_plan (plan=plan(at)entry=0x7ffe2f207e40, paramLI=paramLI(at)entry=0x0, snapshot=snapshot(at)entry=0x0, crosscheck_snapshot=crosscheck_snapshot(at)entry=0x0, read_only=read_only(at)entry=0 '\000',
fire_triggers=fire_triggers(at)entry=1 '\001', tcount=tcount(at)entry=0) at spi.c:2194
#21 0x00000000005ebe01 in SPI_execute (src=src(at)entry=0x2da7818 "SELECT powa_statements_aggregate()", read_only=<optimized out>, tcount=tcount(at)entry=0) at spi.c:389
#22 0x00007f3ad716d9aa in exec_stmt_dynexecute (stmt=0x2cfc520, estate=0x7ffe2f208660) at pl_exec.c:3707
#23 exec_stmt (stmt=0x2cfc520, estate=0x7ffe2f208660) at pl_exec.c:1507
#24 exec_stmts (estate=0x7ffe2f208660, stmts=<optimized out>) at pl_exec.c:1398
#25 0x00007f3ad716eb29 in exec_stmt_block (estate=estate(at)entry=0x7ffe2f208660, block=block(at)entry=0x2cfd450) at pl_exec.c:1197
#26 0x00007f3ad716ca21 in exec_stmt (stmt=0x2cfd450, estate=0x7ffe2f208660) at pl_exec.c:1431
#27 exec_stmts (estate=0x7ffe2f208660, stmts=<optimized out>) at pl_exec.c:1398
#28 0x00007f3ad716fda1 in exec_for_query (estate=estate(at)entry=0x7ffe2f208660, stmt=stmt(at)entry=0x2cfbe60, portal=0x2d59e08, prefetch_ok=prefetch_ok(at)entry=1 '\001') at pl_exec.c:5209
#29 0x00007f3ad716c5ea in exec_stmt_fors (stmt=0x2cfbe60, estate=0x7ffe2f208660) at pl_exec.c:2116
#30 exec_stmt (stmt=0x2cfbe60, estate=0x7ffe2f208660) at pl_exec.c:1467
#31 exec_stmts (estate=0x7ffe2f208660, stmts=<optimized out>) at pl_exec.c:1398
#32 0x00007f3ad716dac8 in exec_stmt_if (stmt=0x2cfd6f0, estate=0x7ffe2f208660) at pl_exec.c:1699
#33 exec_stmt (stmt=0x2cfd6f0, estate=0x7ffe2f208660) at pl_exec.c:1447
#34 exec_stmts (estate=0x7ffe2f208660, stmts=<optimized out>) at pl_exec.c:1398
#35 0x00007f3ad716ec8b in exec_stmt_block (estate=estate(at)entry=0x7ffe2f208660, block=0x2cffce0) at pl_exec.c:1336
#36 0x00007f3ad716eeab in plpgsql_exec_function (func=func(at)entry=0x2cbf0b8, fcinfo=fcinfo(at)entry=0x31a7c88, simple_eval_estate=simple_eval_estate(at)entry=0x0) at pl_exec.c:434
#37 0x00007f3ad7163db7 in plpgsql_call_handler (fcinfo=0x31a7c88) at pl_handler.c:255
#38 0x00000000005c55a2 in ExecMakeFunctionResultNoSets (fcache=0x31a7c18, econtext=0x31a7a28, isNull=0x31a85a0 "\001", isDone=<optimized out>) at execQual.c:2041
#39 0x00000000005cb0ce in ExecTargetList (tupdesc=<optimized out>, isDone=0x7ffe2f2089a4, itemIsDone=0x31a86b8, isnull=0x31a85a0 "\001", values=0x31a8588, econtext=0x31a7a28, targetlist=0x31a8688) at execQual.c:5423
#40 ExecProject (projInfo=<optimized out>, isDone=isDone(at)entry=0x7ffe2f2089a4) at execQual.c:5647
#41 0x00000000005e0162 in ExecResult (node=node(at)entry=0x31a7918) at nodeResult.c:155
#42 0x00000000005c4678 in ExecProcNode (node=node(at)entry=0x31a7918) at execProcnode.c:392
#43 0x00000000005c09d7 in ExecutePlan (dest=0xc19fe0 <spi_printtupDR>, direction=<optimized out>, numberTuples=0, sendTuples=1 '\001', operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x31a7918, estate=0x31a7808)
at execMain.c:1566
#44 standard_ExecutorRun (queryDesc=0x2ecd8e8, direction=<optimized out>, count=0) at execMain.c:338
#45 0x00007f3b15c7c0a5 in pgss_ExecutorRun (queryDesc=0x2ecd8e8, direction=ForwardScanDirection, count=0) at pg_stat_statements.c:877
#46 0x00000000005ebbdf in _SPI_pquery (tcount=0, fire_triggers=1 '\001', queryDesc=0x2ecd8e8) at spi.c:2418
#47 _SPI_execute_plan (plan=plan(at)entry=0x7ffe2f208cb0, paramLI=paramLI(at)entry=0x0, snapshot=snapshot(at)entry=0x0, crosscheck_snapshot=crosscheck_snapshot(at)entry=0x0, read_only=read_only(at)entry=0 '\000',
fire_triggers=fire_triggers(at)entry=1 '\001', tcount=tcount(at)entry=0) at spi.c:2194
#48 0x00000000005ebe01 in SPI_execute (src=src(at)entry=0x7f3b15a77608 "SELECT powa_take_snapshot()", read_only=read_only(at)entry=0 '\000', tcount=tcount(at)entry=0) at spi.c:389
#49 0x00007f3b15a770ee in powa_main (main_arg=<optimized out>) at powa.c:290
#50 0x0000000000668dad in StartBackgroundWorker () at bgworker.c:742
#51 0x00000000006746e4 in do_start_bgworker (rw=0x2c58b40) at postmaster.c:5660
#52 maybe_start_bgworkers () at postmaster.c:5866
#53 0x0000000000674c14 in reaper (postgres_signal_arg=<optimized out>) at postmaster.c:2882
#54 <signal handler called>
#55 0x00007f3b20a80783 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81
#56 0x000000000046ef92 in ServerLoop () at postmaster.c:1685
#57 0x00000000006764b9 in PostmasterMain (argc=argc(at)entry=3, argv=argv(at)entry=0x2c16300) at postmaster.c:1329
#58 0x000000000047059e in main (argc=3, argv=0x2c16300) at main.c:228

The core shows that ProcessUtility hook of cstore_fdw produces a
segfault when trying to copy the relation list of the truncate
statement plan.

In frame 2:
(gdb) p *((TruncateStmt *)parseTree)
$26 = {type = T_TruncateStmt, relations = 0x2f106a8, restart_seqs = 0 '\000', behavior = DROP_RESTRICT}
(gdb) p *(((TruncateStmt *)parseTree)->relations)
$27 = {type = 49574520, length = 0, head = 0x0, tail = 0x2faaab8}

With this invalid list, standard_ProcessUtility would not have failed
but the relation would not have been truncated.

Here is the PlanCacheSource from frame 3, where this generic plan come
from:

(gdb) p *((CachedPlanSource *) plan->plancache_list->head->data.ptr_value)
$108 = {magic = 195726186, raw_parse_tree = 0x2f17598, query_string = 0x2f17698 "TRUNCATE powa_statements_history_current", commandTag = 0x91fd60 "TRUNCATE TABLE", param_types = 0x0, num_params = 0,
parserSetup = 0x7f3ad7165310 <plpgsql_parser_setup>, parserSetupArg = 0x2d99dc8, cursor_options = 0, fixed_result = 0 '\000', resultDesc = 0x0, context = 0x2faabc8, query_list = 0x2f17898, relationOids = 0x0, invalItems = 0x0,
search_path = 0x2f17af8, query_context = 0x2fa9688, rewriteRoleId = 10, rewriteRowSecurity = 1 '\001', dependsOnRLS = 0 '\000', gplan = 0x2f18208, is_oneshot = 0 '\000', is_complete = 1 '\001', is_saved = 1 '\001',
is_valid = 1 '\001', generation = 1, next_saved = 0x2f15c48,
generic_cost = 0, total_custom_cost = 0, num_custom_plans = 0}

Could you please give me some pointers to further investigate this
crash?

Thanks
--
Nicolas Thauvin
+33 (0)1 84 16 92 09
http://dalibo.com - http://dalibo.org

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-03-28 13:54:38 Re: Parallel Aggregates for string_agg and array_agg
Previous Message Arthur Zakirov 2018-03-28 13:51:45 Re: [PROPOSAL] Shared Ispell dictionaries