Re: BUG #15727: PANIC: cannot abort transaction 295144144, it was already committed

From: r(dot)zharkov(at)postgrespro(dot)ru
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15727: PANIC: cannot abort transaction 295144144, it was already committed
Date: 2019-04-03 08:29:21
Message-ID: 4f71a201d23777478ae48c548dd6f9d1@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 2019-04-03 00:13, Tom Lane wrote:
> r(dot)zharkov(at)postgrespro(dot)ru writes:
>> I made the new core file:
>
> Thanks, but this isn't much help --- it just shows what we already
> know, ie the "cannot abort transaction" error occurs after some other
> error was thrown.
>
> What would be more helpful would be to adjust the places that
> can throw the "unexpected table_lock_tuple status" error text to be
> PANIC rather than ERROR, so that the core dump would show a stack
> trace showing how we got to whichever of those places this is.
> Or, run the test with a higher log verbosity, so that you can find
> out which of those places is throwing the error to begin with;
> then just PANIC-ize that one.
>
> FWIW, I see six potential candidates, not two:
>
> pgsql/src/backend/commands/trigger.c: 3380: elog(ERROR,
> "unexpected table_lock_tuple status: %u", test);
> pgsql/src/backend/executor/nodeLockRows.c: 232: elog(ERROR,
> "unexpected table_lock_tuple status: %u",
> pgsql/src/backend/executor/nodeModifyTable.c: 881: elog(ERROR,
> "unexpected table_lock_tuple status: %u",
> pgsql/src/backend/executor/nodeModifyTable.c: 1384: elog(ERROR,
> "unexpected table_lock_tuple status: %u",
> pgsql/src/backend/executor/execReplication.c: 211: elog(ERROR,
> "unexpected table_lock_tuple status: %u", res);
> pgsql/src/backend/executor/execReplication.c: 375: elog(ERROR,
> "unexpected table_lock_tuple status: %u", res);
>
>
> regards, tom lane

Hello,
I have made the new core dump. Here are the results:
PostgreSQL 12devel-master/bb76134

Log:
TRAP: FailedAssertion("!((((tuple->t_data->t_infomask) & 0x0080) ||
(((tuple->t_data->t_infomask) & (0x1000 | ((0x0040 | 0x0010) | 0x0040 |
0x0010))) == 0x0040)))", File: "heapam.c", Line:
2019-04-03 15:13:36.668 +07 [23163] LOG: server process (PID 27904) was
terminated by signal 6: Aborted
2019-04-03 15:13:36.668 +07 [23163] DETAIL: Failed process was running:
UPDATE pgbench_accounts SET abalance = 1 WHERE aid = 2;
2019-04-03 15:13:36.668 +07 [23163] LOG: terminating any other active
server processes
2019-04-03 15:13:36.669 +07 [27907] WARNING: terminating connection
because of crash of another server process

Backtrace:
[New LWP 27904]
[Thread debugging using libthread_db enabled]
Using host libthread_db library
"/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: zharkov test 127.0.0.1(44582) UPDATE
'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f5e0dc7b428 in __GI_raise (sig=sig(at)entry=6) at
../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007f5e0dc7b428 in __GI_raise (sig=sig(at)entry=6) at
../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007f5e0dc7d02a in __GI_abort () at abort.c:89
#2 0x0000000000a9b7ce in ExceptionalCondition (
conditionName=0xb3de08 "!((((tuple->t_data->t_infomask) & 0x0080) ||
(((tuple->t_data->t_infomask) & (0x1000 | ((0x0040 | 0x0010) | 0x0040 |
0x0010))) == 0x0040)))",
errorType=0xb39a69 "FailedAssertion", fileName=0xb39ae8 "heapam.c",
lineNumber=4286) at assert.c:54
#3 0x00000000004ea024 in heap_lock_tuple (relation=0x7f5e0f252d80,
tuple=0x30f4dd0, cid=0, mode=LockTupleNoKeyExclusive,
wait_policy=LockWaitBlock, follow_updates=false,
buffer=0x7ffed968cbb4, tmfd=0x7ffed968cd80) at heapam.c:4286
#4 0x00000000004f5903 in heapam_tuple_lock (relation=0x7f5e0f252d80,
tid=0x7ffed968cdf0, snapshot=0x303cef0, slot=0x30f4d80, cid=0,
mode=LockTupleNoKeyExclusive,
wait_policy=LockWaitBlock, flags=2 '\002', tmfd=0x7ffed968cd80) at
heapam_handler.c:352
#5 0x000000000074055a in table_lock_tuple (rel=0x7f5e0f252d80,
tid=0x7ffed968cdf0, snapshot=0x303cef0, slot=0x30f4d80, cid=0,
mode=LockTupleNoKeyExclusive, wait_policy=LockWaitBlock,
flags=2 '\002', tmfd=0x7ffed968cd80) at
../../../src/include/access/tableam.h:993
#6 0x00000000007421cc in ExecUpdate (mtstate=0x30eb780,
tupleid=0x7ffed968cdf0, oldtuple=0x0, slot=0x30ed1c8,
planSlot=0x30ed050, epqstate=0x30eb878, estate=0x30eb3d8,
canSetTag=true)
at nodeModifyTable.c:1242
#7 0x0000000000743756 in ExecModifyTable (pstate=0x30eb780) at
nodeModifyTable.c:2070
#8 0x0000000000712671 in ExecProcNodeFirst (node=0x30eb780) at
execProcnode.c:445
#9 0x000000000070748c in ExecProcNode (node=0x30eb780) at
../../../src/include/executor/executor.h:239
#10 0x0000000000709d11 in ExecutePlan (estate=0x30eb3d8,
planstate=0x30eb780, use_parallel_mode=false, operation=CMD_UPDATE,
sendTuples=false, numberTuples=0,
direction=ForwardScanDirection, dest=0x30ee508, execute_once=true)
at execMain.c:1643
#11 0x0000000000707ac9 in standard_ExecutorRun (queryDesc=0x303faf8,
direction=ForwardScanDirection, count=0, execute_once=true) at
execMain.c:362
#12 0x0000000000707905 in ExecutorRun (queryDesc=0x303faf8,
direction=ForwardScanDirection, count=0, execute_once=true) at
execMain.c:306
#13 0x0000000000915950 in ProcessQuery (plan=0x30df120,
sourceText=0x30161d8 "UPDATE pgbench_accounts SET abalance = 1 WHERE aid
= 2;", params=0x0, queryEnv=0x0, dest=0x30ee508,
completionTag=0x7ffed968d270 "") at pquery.c:161
#14 0x000000000091739c in PortalRunMulti (portal=0x3098078,
isTopLevel=true, setHoldSnapshot=false, dest=0x30ee508,
altdest=0x30ee508, completionTag=0x7ffed968d270 "") at pquery.c:1283
#15 0x00000000009168e8 in PortalRun (portal=0x3098078,
count=9223372036854775807, isTopLevel=true, run_once=true,
dest=0x30ee508, altdest=0x30ee508, completionTag=0x7ffed968d270 "")
at pquery.c:796
#16 0x000000000091027a in exec_simple_query (query_string=0x30161d8
"UPDATE pgbench_accounts SET abalance = 1 WHERE aid = 2;") at
postgres.c:1215
#17 0x00000000009148ff in PostgresMain (argc=1, argv=0x3042980,
dbname=0x3042960 "test", username=0x3012d98 "zharkov") at
postgres.c:4247
#18 0x000000000086406c in BackendRun (port=0x303b030) at
postmaster.c:4399
#19 0x00000000008637cd in BackendStartup (port=0x303b030) at
postmaster.c:4090
#20 0x000000000085facf in ServerLoop () at postmaster.c:1703
#21 0x000000000085f2dd in PostmasterMain (argc=3, argv=0x3010d00) at
postmaster.c:1376
#22 0x000000000077aeba in main (argc=3, argv=0x3010d00) at main.c:228

The changes I made:
diff --git a/src/backend/commands/trigger.c
b/src/backend/commands/trigger.c
index e03ffdd..ab3bace 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -3358,7 +3358,7 @@ GetTupleForTrigger(EState *estate,
ereport(ERROR,

(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
errmsg("could
not serialize access due to concurrent update")));
- elog(ERROR, "unexpected table_lock_tuple
status: %u", test);
+ elog(PANIC, "unexpected table_lock_tuple
status: %u", test);
break;

case TM_Deleted:
diff --git a/src/backend/executor/execReplication.c
b/src/backend/executor/execReplication.c
index d8b48c6..d06cbe0 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -207,7 +207,7 @@ retry:
elog(ERROR, "attempted to lock invisible
tuple");
break;
default:
- elog(ERROR, "unexpected table_lock_tuple
status: %u", res);
+ elog(PANIC, "unexpected table_lock_tuple
status: %u", res);
break;
}
}
@@ -371,7 +371,7 @@ retry:
elog(ERROR, "attempted to lock invisible
tuple");
break;
default:
- elog(ERROR, "unexpected table_lock_tuple
status: %u", res);
+ elog(PANIC, "unexpected table_lock_tuple
status: %u", res);
break;
}
}
diff --git a/src/backend/executor/nodeLockRows.c
b/src/backend/executor/nodeLockRows.c
index 7674ac8..c7a261c 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -229,7 +229,7 @@ lnext:
ereport(ERROR,

(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
errmsg("could
not serialize access due to concurrent update")));
- elog(ERROR, "unexpected table_lock_tuple
status: %u",
+ elog(PANIC, "unexpected table_lock_tuple
status: %u",
test);
break;

diff --git a/src/backend/executor/nodeModifyTable.c
b/src/backend/executor/nodeModifyTable.c
index 7be0e77..b15b7a0 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -780,7 +780,7 @@ ldelete:;
* locking the
latest version via
*
TUPLE_LOCK_FLAG_FIND_LAST_VERSION.
*/
- elog(ERROR,
"unexpected table_lock_tuple status: %u",
+ elog(PANIC,
"unexpected table_lock_tuple status: %u",

result);
return NULL;
}
@@ -1269,7 +1269,7 @@ lreplace:;

default:
/* see
table_lock_tuple call in ExecDelete() */
- elog(ERROR,
"unexpected table_lock_tuple status: %u",
+ elog(PANIC,
"unexpected table_lock_tuple status: %u",

result);
return NULL;
}

regards,

Roman Zharkov

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message r.zharkov 2019-04-03 08:38:16 Re: BUG #15727: PANIC: cannot abort transaction 295144144, it was already committed
Previous Message Magnus Hagander 2019-04-03 08:10:21 Re: Re: BUG #15731: CVE-2019-9193