BUG #16219: EvalPlanQualFetchRowMark segfaults on Updates

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: julian(dot)schauder(at)gmx(dot)de
Subject: BUG #16219: EvalPlanQualFetchRowMark segfaults on Updates
Date: 2020-01-20 14:54:34
Message-ID: 16219-578ae955109d9fd2@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 16219
Logged by: Julian Schauder
Email address: julian(dot)schauder(at)gmx(dot)de
PostgreSQL version: 12.1
Operating system: Ubuntu 18.04
Description:

Howdy Hackers,

for a week now (upgrade pg11=>12) we're experiencing a bug that causes
segfaults on a daily basis.
I've yet to reproduce this exact behaviour as the querys do not fail all the
time but rather now-and-then
when run concurrently. ( ~4 Crashes so far )

Outputs(identifiers) are anonymised.

> 2020-01-20 01:39:24.246 CET [2112] LOG: server process (PID 31599) was
terminated by signal 11: Segmentation fault
> 2020-01-20 01:39:24.246 CET [2112] DETAIL: Failed process was running:
UPDATE partitionedtable ia SET A=A, B=now() FROM temptable tt WHERE
ia.id=tt.ia_id AND A::text IS DISTINCT FROM tt.A::text;
> 2020-01-20 01:39:24.246 CET [2112] LOG: terminating any other active
server processes

> #0 EvalPlanQualFetchRowMark (epqstate=0x5576bfbccfc0, rti=2,
slot=slot(at)entry=0x5576c888b6c0) at
./build/../src/backend/executor/execMain.c:2590
> #1 0x00005576bdf449ce in ExecScanFetch (recheckMtd=0x5576bdf65740
<SeqRecheck>, accessMtd=0x5576bdf656b0 <SeqNext>, node=0x5576c888b500) at
./build/../src/backend/executor/execScan.c:115
> #2 ExecScan (node=0x5576c888b500, accessMtd=0x5576bdf656b0 <SeqNext>,
recheckMtd=0x5576bdf65740 <SeqRecheck>) at
./build/../src/backend/executor/execScan.c:200
> #3 0x00005576bdf63f7d in ExecProcNode (node=0x5576c888b500) at
./build/../src/include/executor/executor.h:239
> #4 ExecNestLoop (pstate=<optimized out>) at
./build/../src/backend/executor/nodeNestloop.c:109
> #5 0x00005576bdf3e9a9 in ExecProcNode (node=0x5576c888b310) at
./build/../src/include/executor/executor.h:239
> #6 EvalPlanQualNext (epqstate=epqstate(at)entry=0x5576bfbccfc0) at
./build/../src/backend/executor/execMain.c:2695
> #7 0x00005576bdf3ee77 in EvalPlanQual
(epqstate=epqstate(at)entry=0x5576bfbccfc0,
relation=relation(at)entry=0x7fbfd57df168, rti=42,
inputslot=inputslot(at)entry=0x5576cfb66578) at ./build/../src/backend/>
executor/execMain.c:2467
> #8 0x00005576bdf61f5e in ExecUpdate
(mtstate=mtstate(at)entry=0x5576bfbccec8, tupleid=0x7ffdd552f89a, oldtuple=0x0,
slot=<optimized out>, planSlot=planSlot(at)entry=0x5576c153f428,
epqstate=epqstate(at)entry=0x5576bfbccfc0, estate=0x5576bf5df080,
> canSetTag=true) at
./build/../src/backend/executor/nodeModifyTable.c:1387
> #9 0x00005576bdf62df9 in ExecModifyTable (pstate=0x5576bfbccec8) at
./build/../src/backend/executor/nodeModifyTable.c:2223
> #10 0x00005576bdf3c23d in ExecProcNode (node=0x5576bfbccec8) at
./build/../src/include/executor/executor.h:239
> #11 ExecutePlan (execute_once=<optimized out>, dest=0x5576be63ec00
<donothingDR>, direction=<optimized out>, numberTuples=0,
sendTuples=<optimized out>, operation=CMD_UPDATE,
use_parallel_mode=<optimized out>, planstate=0x5576bfbccec8,
> estate=0x5576bf5df080) at
./build/../src/backend/executor/execMain.c:1646
> #12 standard_ExecutorRun (queryDesc=0x5576bed15ae0, direction=<optimized
out>, count=0, execute_once=<optimized out>) at
./build/../src/backend/executor/execMain.c:364
> #13 0x00007fc89b1a4125 in pgss_ExecutorRun (queryDesc=0x5576bed15ae0,
direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at
./build/../contrib/pg_stat_statements/pg_stat_statements.c:893
> #14 0x00005576be0993b5 in ProcessQuery (plan=<optimized out>,
> sourceText=0x5576bed159d0 "UPDATE partitionedtable ia SET A=A, B=now()
FROM temptable tt WHERE ia.id=tt.ia_id AND A::text IS DISTINCT FROM
tt.A::text"...,
> params=0x0, queryEnv=0x0, dest=0x5576be63ec00 <donothingDR>,
completionTag=0x7ffdd552ff40 "") at
./build/../src/backend/tcop/pquery.c:161
> #15 0x00005576be0995f3 in PortalRunMulti
(portal=portal(at)entry=0x5576bea978f0, isTopLevel=isTopLevel(at)entry=true,
setHoldSnapshot=setHoldSnapshot(at)entry=false, dest=0x5576be63ec00
<donothingDR>, dest(at)entry=0x5576bea00890,
> altdest=0x5576be63ec00 <donothingDR>, altdest(at)entry=0x5576bea00890,
completionTag=completionTag(at)entry=0x7ffdd552ff40 "") at
./build/../src/backend/tcop/pquery.c:1283
> #16 0x00005576be09a1ff in PortalRun (portal=portal(at)entry=0x5576bea978f0,
count=count(at)entry=1, isTopLevel=isTopLevel(at)entry=true, run_once=<optimized
out>, dest=dest(at)entry=0x5576bea00890, altdest=altdest(at)entry=0x5576bea00890,

> completionTag=0x7ffdd552ff40 "") at
./build/../src/backend/tcop/pquery.c:796
> #17 0x00005576be097a31 in exec_execute_message (max_rows=1,
portal_name=0x5576bea00480 "") at
./build/../src/backend/tcop/postgres.c:2090
> #18 PostgresMain (argc=<optimized out>, argv=argv(at)entry=0x5576bea2fbd0,
dbname=<optimized out>, username=<optimized out>) at
./build/../src/backend/tcop/postgres.c:4297
> #19 0x00005576be01e53b in BackendRun (port=0x5576bea27dc0,
port=0x5576bea27dc0) at
./build/../src/backend/postmaster/postmaster.c:4437
> #20 BackendStartup (port=0x5576bea27dc0) at
./build/../src/backend/postmaster/postmaster.c:4128
> #21 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1704
> #22 0x00005576be01f513 in PostmasterMain (argc=5, argv=0x5576be9f9090) at
./build/../src/backend/postmaster/postmaster.c:1377
> #23 0x00005576bdd95cf6 in main (argc=5, argv=0x5576be9f9090) at
./build/../src/backend/main/main.c:228

> #0 EvalPlanQualFetchRowMark (epqstate=0x5576bfbccfc0, rti=2,
slot=slot(at)entry=0x5576c888b6c0) at
./build/../src/backend/executor/execMain.c:2590
> earm = 0x5576c7176258
> erm = 0x24
> datum = <optimized out>
> isNull = false
> __func__ = "EvalPlanQualFetchRowMark"
> #1 0x00005576bdf449ce in ExecScanFetch (recheckMtd=0x5576bdf65740
<SeqRecheck>, accessMtd=0x5576bdf656b0 <SeqNext>, node=0x5576c888b500) at
./build/../src/backend/executor/execScan.c:115
> slot = 0x5576c888b6c0
> epqstate = <optimized out>
> scanrelid = <optimized out>
> slot = <optimized out>
> estate = 0x5576c888b100
> estate = <optimized out>
> epqstate = <optimized out>
> scanrelid = <optimized out>
> slot = <optimized out>
> slot = <optimized out>
> slot = <optimized out>
> slot = <optimized out>
> #2 ExecScan (node=0x5576c888b500,

Excerpt shows the pointer 0x24 of erm to be uninitialized'ish. Once i can
capture a second
stacktrace i might be able to confirm if this is indeed a random value or
always 0x24. However,
we're workarounding this by simply not running this query concurrently
anymore hence i might
lack new stacktraces soon.

> UPDATE partitionedtable ia SET A=A, B=now() FROM temptable tt WHERE
ia.id=tt.ia_id AND A::text IS DISTINCT FROM tt.A::text;
> QUERY PLAN

>
-------------------------------------------------------------------------------------------------------------------------------
> Update on partitionedtable ia (cost=0.43..1202173.35 rows=465660
width=143)
> Update on partitionedtable ia
> Update on partitionedtable_0062 ia_1
> Update on partitionedtable_0061 ia_2
> Update on partitionedtable_0060 ia_3
> [repeats for another ~400 Partitions]
> [..]
> -> Nested Loop (cost=0.43..3386.78 rows=1194 width=145)
> -> Seq Scan on temptable tt (cost=0.00..22.00 rows=1200
width=46)
> -> Index Scan using partitionedtable_pk on partitionedtable ia
(cost=0.43..2.79 rows=1 width=560)
> Index Cond: (id = tt.ia_id)
> Filter: ((document)::text IS DISTINCT FROM
(tt.document_json)::text)
> -> Nested Loop (cost=0.42..3108.99 rows=1194 width=159)
> -> Seq Scan on temptable tt (cost=0.00..22.00 rows=1200
width=46)
> -> Index Scan using partitionedtable_062_pk on
partitionedtable_0062 ia_1 (cost=0.42..2.56 rows=1 width=518)
> Index Cond: (id = tt.ia_id)
> Filter: ((document)::text IS DISTINCT FROM
(tt.document_json)::text)
> [repeats for another ~400 Partitions]
> [..]
>
> JIT:
> Functions: 3510
> Options: Inlining true, Optimization true, Expressions true, Deforming
true
> (2344 rows)

The Query basically updates a massively partitioned* Table from the
contents
of a temptable.

*Inheritance-Partitioning on ID(not modified within the query)
*Inheritance has triggers. ISTM that they are not of any concern.

System is a current Ubuntu18.04 with a freshly pg_upgrade'd postgresql-12.1
pgdg-build,
including new statistics, indices and vacuum.

I'm currently trying to get this to a minimized and reproducible form as i
do not see an
obvious error in the code yet. For the time being i fail to actually
recreate as accurate as
production does :)

regards,
julian

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2020-01-20 16:14:26 Re: BUG #16219: EvalPlanQualFetchRowMark segfaults on Updates
Previous Message Raúl Marín 2020-01-20 10:04:05 Re: BUG #16199: pg_restore stuck on interrupts