BUG #14053: postgres crashes in plpgsql under load of concurrent transactions

From: maxim(at)jeslalabs(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #14053: postgres crashes in plpgsql under load of concurrent transactions
Date: 2016-03-30 07:04:14
Message-ID: 20160330070414.8944.52106@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 14053
Logged by: Maxim Zakharov
Email address: maxim(at)jeslalabs(dot)com
PostgreSQL version: 9.4.6
Operating system: Ubuntu
Description:

2016-03-30 14:30:09 AEDT [21935-17] LOG: database system is ready to accept
connections
2016-03-30 15:14:08 AEDT [21935-18] LOG: server process (PID 4070) was
terminated by signal 11: Segmentation fault
2016-03-30 15:14:08 AEDT [21935-19] LOG: terminating any other active
server processes
2016-03-30 15:14:08 AEDT [5441-1] root(at)db_purgora WARNING: terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5441-2] root(at)db_purgora DETAIL: The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5441-3] root(at)db_purgora HINT: In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [5402-1] root(at)db_purgora WARNING: terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5402-2] root(at)db_purgora DETAIL: The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5402-3] root(at)db_purgora HINT: In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [5419-1] root(at)db_purgora WARNING: terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5419-2] root(at)db_purgora DETAIL: The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5419-3] root(at)db_purgora HINT: In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [5400-1] root(at)db_purgora WARNING: terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5400-2] root(at)db_purgora DETAIL: The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5400-3] root(at)db_purgora HINT: In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [5398-1] root(at)db_purgora WARNING: terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5398-2] root(at)db_purgora DETAIL: The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5398-3] root(at)db_purgora HINT: In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [5395-1] root(at)db_purgora WARNING: terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5395-2] root(at)db_purgora DETAIL: The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5395-3] root(at)db_purgora HINT: In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [5385-1] root(at)db_purgora WARNING: terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5385-2] root(at)db_purgora DETAIL: The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5385-3] root(at)db_purgora HINT: In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [5377-1] root(at)db_purgora WARNING: terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5377-2] root(at)db_purgora DETAIL: The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5377-3] root(at)db_purgora HINT: In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [3855-1] root(at)db_purgora WARNING: terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [3855-2] root(at)db_purgora DETAIL: The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [3855-3] root(at)db_purgora HINT: In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [29127-2] WARNING: terminating connection because
of crash of another server process
2016-03-30 15:14:08 AEDT [29127-3] DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2016-03-30 15:14:08 AEDT [29127-4] HINT: In a moment you should be able to
reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [29145-1] root(at)db_purgora WARNING: terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [29145-2] root(at)db_purgora DETAIL: The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [29145-3] root(at)db_purgora HINT: In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [21935-20] LOG: all server processes terminated;
reinitializing
2016-03-30 15:14:09 AEDT [8506-1] LOG: database system was interrupted;
last known up at 2016-03-30 15:10:10 AEDT
2016-03-30 15:14:09 AEDT [8507-1] root(at)db_purgora FATAL: the database
system is in recovery mode
2016-03-30 15:14:09 AEDT [8508-1] root(at)db_purgora FATAL: the database
system is in recovery mode
2016-03-30 15:14:09 AEDT [8509-1] root(at)db_purgora FATAL: the database
system is in recovery mode
2016-03-30 15:14:09 AEDT [8510-1] root(at)db_purgora FATAL: the database
system is in recovery mode
2016-03-30 15:14:09 AEDT [8511-1] root(at)db_purgora FATAL: the database
system is in recovery mode
2016-03-30 15:14:09 AEDT [8512-1] root(at)db_purgora FATAL: the database
system is in recovery mode
2016-03-30 15:14:09 AEDT [8506-2] LOG: database system was not properly
shut down; automatic recovery in progress
2016-03-30 15:14:09 AEDT [8506-3] LOG: redo starts at 0/F3EB6208
2016-03-30 15:14:09 AEDT [8506-4] LOG: record with zero length at
0/F418A208
2016-03-30 15:14:09 AEDT [8506-5] LOG: redo done at 0/F418A1D8
2016-03-30 15:14:09 AEDT [8506-6] LOG: last completed transaction was at
log time 2016-03-30 15:14:08.091602+11
2016-03-30 15:14:09 AEDT [8506-7] LOG: MultiXact member wraparound
protections are now enabled
2016-03-30 15:14:09 AEDT [8516-1] LOG: autovacuum launcher started
2016-03-30 15:14:09 AEDT [21935-21] LOG: database system is ready to accept
connections

# gdb /usr/lib/postgresql/9.4/bin/postgres
/tmp/core-postgres-11-120-129-4070-1459311248
GNU gdb (Ubuntu 7.10-1ubuntu2) 7.10
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/postgresql/9.4/bin/postgres...Reading symbols
from
/usr/lib/debug/.build-id/e4/3a4b938ce055dcbff82e81a44d43e708d6081f.debug...done.
done.
[New LWP 4070]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: postgres db_purgora [local] SELECT
'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f8f42521521 in exec_eval_simple_expr (rettype=<optimised out>,
isNull=0x7ffcef42e143 "",
result=<synthetic pointer>, expr=0x55d62c60dab8, estate=0x7ffcef42e320)
at
/build/postgresql-9.4-IgGWiy/postgresql-9.4-9.4.6/build/../src/pl/plpgsql/src/pl_exec.c:5203
5203 /build/postgresql-9.4-IgGWiy/postgresql-9.4-9.4.6/build/../src/pl/plpgsql/src/pl_exec.c:
No such file or directory.
(gdb) p *expr
$1 = {dtype = 5, dno = 0, query = 0x55d62c60db48 "SELECT false", plan =
0x55d62ca07570, paramnos = 0x0,
func = 0x55d62ca0ccb0, ns = 0x55d62c991e20, expr_simple_expr =
0x55d62ca07b90, expr_simple_generation = 1,
expr_simple_type = 16, expr_simple_state = 0x0, expr_simple_in_use = 1
'\001', expr_simple_lxid = 0}
(gdb) p curlxid
$2 = <optimised out>
(gdb) p MyProc->lxid
$3 = 0

It looks like both MyProc->lxid (and therefore curlxid) and
expr->expr_simple_lxid are equal to InvalidLocalTransactionId (which is
defined to be 0) and are equal to each other, which means the following code
under if statement of exec_eval_simple_expr() in
src/pl/plpgsql/src/pl_exec.c (line 5158) is not executed:

if (expr->expr_simple_lxid != curlxid)
{
oldcontext =
MemoryContextSwitchTo(estate->simple_eval_estate->es_query_cxt);
expr->expr_simple_state = ExecInitExpr(expr->expr_simple_expr, NULL);
expr->expr_simple_in_use = false;
expr->expr_simple_lxid = curlxid;
MemoryContextSwitchTo(oldcontext);
}

Perhaps the fix should be as follow:
if (expr->expr_simple_lxid != curlxid || curlxid ==
InvalidLocalTransactionId)
{
oldcontext =
MemoryContextSwitchTo(estate->simple_eval_estate->es_query_cxt);
expr->expr_simple_state = ExecInitExpr(expr->expr_simple_expr, NULL);
expr->expr_simple_in_use = false;
expr->expr_simple_lxid = curlxid;
MemoryContextSwitchTo(oldcontext);
}

Sorry, I have not checked the validity of this fix.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message tsingle 2016-03-30 08:38:19 BUG #14054: "create index using gist ..." on large table never completes.
Previous Message Jarosław Stokłosa 2016-03-30 06:55:15 Re: BUG #14046: Bad mathematical rules for 0 cast