[sqlsmith] crashes in RestoreSnapshot on hot standby

From: Andreas Seltenreich <seltenreich(at)gmx(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: [sqlsmith] crashes in RestoreSnapshot on hot standby
Date: 2016-06-30 21:26:26
Message-ID: 87r3bes70d.fsf@credativ.de
Lists: pgsql-hackers

Running sqlsmith on a streaming slave (master as of f8c5855) is
uneventful as long as the master is idle. As soon as I also run it on
the master, the standby frequently crashes in RestoreSnapshot.
The crashes don't seem to be specific to the queries, as the same
queries don't trigger a crash when re-run.

Backtraces always look like the ones below.

regards,
Andreas

**** BEGIN BACKTRACE OF CORE FILE ./slave/postgres.9826@.core ON doombat ****
Core was generated by `postgres: smith regression [local] SELECT '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:167
167 ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: No such file or directory.
(gdb) bt
#0 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:167
#1 0x0000000000822032 in RestoreSnapshot (start_address=start_address@entry=0x7f2701d5a110 <error: Cannot access memory at address 0x7f2701d5a110>) at snapmgr.c:2020
#2 0x00000000004a934a in heap_beginscan_parallel (relation=0x2060a90, parallel_scan=parallel_scan@entry=0x7f2701d5a0f8) at heapam.c:1657
#3 0x00000000005fbedf in ExecSeqScanInitializeDSM (node=0x1f5b470, pcxt=0x221af88) at nodeSeqscan.c:327
#4 0x00000000005dd0ad in ExecParallelInitializeDSM (planstate=planstate@entry=0x1f5b470, d=d@entry=0x7ffd4ba200d0) at execParallel.c:245
#5 0x00000000005dd425 in ExecInitParallelPlan (planstate=0x1f5b470, estate=estate@entry=0x1f5ab28, nworkers=2) at execParallel.c:477
#6 0x00000000005ef4a4 in ExecGather (node=node@entry=0x1f5b048) at nodeGather.c:159
#7 0x00000000005dda48 in ExecProcNode (node=node@entry=0x1f5b048) at execProcnode.c:515
#8 0x00000000005f4b30 in ExecLimit (node=node@entry=0x1f5acd0) at nodeLimit.c:91
#9 0x00000000005dd9d8 in ExecProcNode (node=node@entry=0x1f5acd0) at execProcnode.c:531
#10 0x00000000005fef7c in ExecSetParamPlan (node=<optimized out>, econtext=0x1f5c138) at nodeSubplan.c:999
#11 0x00000000005e28b5 in ExecEvalParamExec (exprstate=<optimized out>, econtext=<optimized out>, isNull=0x22045b0 "", isDone=<optimized out>) at execQual.c:1135
#12 0x00000000005deb6d in ExecMakeFunctionResultNoSets (fcache=0x2204200, econtext=0x1f5c138, isNull=0x2203d98 "", isDone=<optimized out>) at execQual.c:2015
#13 0x00000000005de29a in ExecEvalCoalesce (coalesceExpr=<optimized out>, econtext=0x1f5c138, isNull=0x2203d98 "", isDone=<optimized out>) at execQual.c:3446
#14 0x00000000005deb6d in ExecMakeFunctionResultNoSets (fcache=0x22039e8, econtext=0x1f5c138, isNull=0x7ffd4ba203df "", isDone=<optimized out>) at execQual.c:2015
#15 0x00000000005e4939 in ExecQual (qual=<optimized out>, econtext=econtext@entry=0x1f5c138, resultForNull=resultForNull@entry=0 '\000') at execQual.c:5269
#16 0x00000000005faef1 in ExecResult (node=node@entry=0x1f5c020) at nodeResult.c:82
#17 0x00000000005ddbf8 in ExecProcNode (node=node@entry=0x1f5c020) at execProcnode.c:392
#18 0x00000000005d9c1f in ExecutePlan (dest=0x1ebb7d0, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x1f5c020, estate=0x1f5ab28) at execMain.c:1567
#19 standard_ExecutorRun (queryDesc=0x1f5a718, direction=<optimized out>, count=0) at execMain.c:338
#20 0x00000000006f7238 in PortalRunSelect (portal=portal@entry=0x1d13be8, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x1ebb7d0) at pquery.c:946
#21 0x00000000006f875e in PortalRun (portal=0x1d13be8, count=9223372036854775807, isTopLevel=<optimized out>, dest=0x1ebb7d0, altdest=0x1ebb7d0, completionTag=0x7ffd4ba20840 "") at pquery.c:787
#22 0x00000000006f6003 in exec_simple_query (query_string=<optimized out>) at postgres.c:1094
#23 PostgresMain (argc=30489576, argv=0x1ecfb08, dbname=0x1cf5a00 "regression", username=0x1ecfc20 "\b\373\354\001") at postgres.c:4074
#24 0x000000000046ca67 in BackendRun (port=0x1d17b50) at postmaster.c:4262
#25 BackendStartup (port=0x1d17b50) at postmaster.c:3936
#26 ServerLoop () at postmaster.c:1693
#27 0x0000000000690ab7 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x1cf45e0) at postmaster.c:1301
#28 0x000000000046d9cd in main (argc=3, argv=0x1cf45e0) at main.c:228
(gdb) p debug_query_string
$1 = 0x1d68a78 "select \n sample_0.j as c0\nfrom \n public.testjsonb as sample_0 tablesample system (8) \nwhere cast(coalesce(pg_catalog.char_length(\n cast((select comment from public.room limit 1 offset 20)\n as text)),\n pg_catalog.pg_trigger_depth()) as integer) <> 3"

**** BEGIN BACKTRACE OF CORE FILE ./slave/postgres.8104@.core ON marbit ****
Core was generated by `postgres: bgworker: parallel worker for PID 2610 '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:167
167 ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: No such file or directory.
(gdb) bt
#0 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:167
#1 0x0000000000822032 in RestoreSnapshot (start_address=0x7f7b1ee4fa58 <error: Cannot access memory at address 0x7f7b1ee4fa58>) at snapmgr.c:2020
#2 0x00000000004e6142 in ParallelWorkerMain (main_arg=<optimized out>) at parallel.c:1010
#3 0x0000000000683b62 in StartBackgroundWorker () at bgworker.c:726
#4 0x000000000068ec32 in do_start_bgworker (rw=0x22a8900) at postmaster.c:5535
#5 maybe_start_bgworker () at postmaster.c:5709
#6 0x000000000068f686 in sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:4971
#7 <signal handler called>
#8 0x00007f7b1e3f3ac3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81
#9 0x000000000046c381 in ServerLoop () at postmaster.c:1657
#10 0x0000000000690ab7 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x22825e0) at postmaster.c:1301
#11 0x000000000046d9cd in main (argc=3, argv=0x22825e0) at main.c:228
