BUG #13985: Segmentation fault on PREPARE TRANSACTION

From: chris(dot)tessels(at)inergy(dot)nl
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #13985: Segmentation fault on PREPARE TRANSACTION
Date: 2016-02-24 12:58:24
Message-ID: 20160224125824.2573.27524@wrigleys.postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 13985
Logged by: Chris Tessels
Email address: chris(dot)tessels(at)inergy(dot)nl
PostgreSQL version: 9.4.6
Operating system: CentOS release 6.6 (Final)
Description:

We first saw this segmentation fault on an instance running PostgreSQL
9.4.4.
To make sure the problem had not already been fixed, we upgraded another
instance on identical hardware to PostgreSQL 9.4.6.
Using a second machine should also rule out hardware-related problems such
as bad memory.
We also updated our JDBC driver to the latest version:
https://jdbc.postgresql.org/download/postgresql-9.4.1208.jar

Although we were able to make both systems crash, we were not able to
systematically reproduce the problem.

Here is postgresql.log from the time of the segfault:

/data/postgres/pg_log/postgresql.log
2016-02-23 14:45:12.901 CET LOG: server process (PID 519) was terminated
by signal 11: Segmentation fault
2016-02-23 14:45:12.901 CET DETAIL: Failed process was running: PREPARE
TRANSACTION
'131077_AAAAAAAAAAAAAP//CjIGBaCG30NWzFRpAdtPATE=_AAAAAAAAAAAAAP//CjIGBaCG30NWzFRpAdtP4AAAAAIAAAAA'
2016-02-23 14:45:12.901 CET LOG: terminating any other active server
processes
2016-02-23 14:45:12.902 CET LOG: archiver process (PID 114131) exited
with exit code 1
2016-02-23 14:45:14.226 CET replication_user FATAL: the database system is
in recovery mode
2016-02-23 14:45:14.564 CET LOG: all server processes terminated;
reinitializing
2016-02-23 14:45:17.404 CET LOG: database system was interrupted; last
known up at 2016-02-23 14:31:36 CET
2016-02-23 14:53:06.753 CET mailinfo_ow FATAL: the database system is in
recovery mode
2016-02-23 14:53:06.760 CET mailinfo_ow FATAL: the database system is in
recovery mode
2016-02-23 14:53:08.830 CET replication_user FATAL: the database system is
in recovery mode
2016-02-23 14:53:10.203 CET LOG: record with zero length at 5/1409A540
2016-02-23 14:53:10.203 CET LOG: redo done at 5/1409A3C0
2016-02-23 14:53:10.203 CET LOG: last completed transaction was at log
time 2016-02-23 14:44:32.012791+01
2016-02-23 14:53:11.708 CET LOG: MultiXact member wraparound protections
are now enabled
2016-02-23 14:53:11.710 CET LOG: recovering prepared transaction
12609585
2016-02-23 14:53:11.710 CET LOG: recovering prepared transaction
12609594
2016-02-23 14:53:11.710 CET LOG: recovering prepared transaction
12609596
2016-02-23 14:53:11.710 CET LOG: recovering prepared transaction
12609601
2016-02-23 14:53:11.710 CET LOG: recovering prepared transaction
12609591
2016-02-23 14:53:11.710 CET LOG: recovering prepared transaction
12609588
2016-02-23 14:53:11.710 CET LOG: recovering prepared transaction
12609593
2016-02-23 14:53:11.710 CET LOG: recovering prepared transaction
12609584
2016-02-23 14:53:11.710 CET LOG: recovering prepared transaction
12609595
2016-02-23 14:53:11.710 CET LOG: recovering prepared transaction
12609586
2016-02-23 14:53:11.710 CET LOG: recovering prepared transaction
12609590
2016-02-23 14:53:11.756 CET LOG: autovacuum launcher started
2016-02-23 14:53:11.757 CET LOG: database system is ready to accept
connections
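
The identifier in the PREPARE TRANSACTION statement in the log above appears to follow the layout the PostgreSQL JDBC driver uses for XA transaction ids: the format id, then the Base64-encoded global transaction id and branch qualifier, separated by underscores. Assuming that layout, a minimal Python sketch for splitting and decoding it could look like this (decode_gid is a name made up for illustration, not part of any driver or server API):

```python
import base64


def decode_gid(gid):
    """Split a GID of the assumed form <format_id>_<b64 gtrid>_<b64 bqual>.

    The standard Base64 alphabet contains no underscore, so splitting on
    "_" cannot cut through an encoded field.
    """
    format_id, gtrid_b64, bqual_b64 = gid.split("_")
    return (
        int(format_id),
        base64.b64decode(gtrid_b64, validate=True),
        base64.b64decode(bqual_b64, validate=True),
    )


# GID copied from the "Failed process was running" log entry.
GID = (
    "131077_AAAAAAAAAAAAAP//CjIGBaCG30NWzFRpAdtPATE="
    "_AAAAAAAAAAAAAP//CjIGBaCG30NWzFRpAdtP4AAAAAIAAAAA"
)

format_id, gtrid, bqual = decode_gid(GID)
print("format id:", format_id)
print("gtrid (%d bytes): %s" % (len(gtrid), gtrid.hex()))
print("bqual (%d bytes): %s" % (len(bqual), bqual.hex()))
```

Decoding the identifier this way may help match a prepared transaction on the server against the transaction manager's own records.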

/var/log/messages
Feb 23 14:44:32 [hostname] kernel: postmaster[519]: segfault at
7fc2d8e6b634 ip 000000000066a95b sp 00007fff5ca887d0 error 4 in
postgres[400000+566000]
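
For reference, the fields of the kernel line can be read mechanically. The sketch below assumes the usual x86 Linux layout "segfault at ADDR ip IP sp SP error CODE in IMAGE[BASE+SIZE]" and the x86 page-fault error-code bits (bit 0: page was present, bit 1: write access, bit 2: fault in user mode). parse_segfault is an illustrative name, not an existing tool.

```python
import re

# Assumed layout of the kernel message; all numeric fields are hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<image>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Extract addresses and decode the page-fault error code."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    err = int(m.group("err"), 16)
    return {
        "image": m.group("image"),
        "fault_addr": int(m.group("addr"), 16),
        "ip": ip,
        "offset": ip - base,               # ip relative to the mapping start
        "ip_in_image": base <= ip < base + size,
        "page_present": bool(err & 1),     # bit 0
        "access": "write" if err & 2 else "read",  # bit 1
        "user_mode": bool(err & 4),        # bit 2
    }


# The wrapped /var/log/messages entry, joined back into one line.
LINE = (
    "kernel: postmaster[519]: segfault at 7fc2d8e6b634 "
    "ip 000000000066a95b sp 00007fff5ca887d0 error 4 "
    "in postgres[400000+566000]"
)

info = parse_segfault(LINE)
print(info["image"], hex(info["offset"]), info["access"], info["user_mode"])
```

Under those assumptions, the entry describes a user-mode read of an address that was not mapped, with the instruction pointer inside the postgres executable.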


Analyzing the coredump gave us the following information:
sudo -u postgres gdb -q \
  -c /var/spool/abrt/ccpp-2016-02-23-16\:27\:50-77566/coredump \
  /usr/pgsql-9.4/bin/postgres

Core was generated by `postgres: mailinfo_ow mailinfo_ods 10.50.6.6(4188'.
Program terminated with signal 11, Segmentation fault.

#0 MinimumActiveBackends (min=50) at procarray.c:2472
2472 if (pgxact->xid == InvalidTransactionId)

PostgreSQL version:
PostgreSQL 9.4.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7
20120313 (Red Hat 4.4.7-16), 64-bit


Operating system and version:
[root(at)[hostname] ccpp-2016-02-23-16:27:50-77566]# cat /etc/redhat-release
CentOS release 6.6 (Final)
[root(at)[hostname] ccpp-2016-02-23-16:27:50-77566]# uname -a
Linux [hostname] 2.6.32-504.el6.x86_64 #1 SMP Wed Oct 15 04:27:16 UTC 2014
x86_64 x86_64 x86_64 GNU/Linux


We installed PostgreSQL with yum from the following repository:
http://yum.postgresql.org/9.4/redhat/rhel-6.6-x86_64/


Our configuration settings are:
name                         | current_setting                       | source
-----------------------------+---------------------------------------+---------------------
archive_command              | rsync -a %p /data/postgres_archive/%f | configuration file
archive_mode                 | on                                    | configuration file
bgwriter_delay               | 100ms                                 | configuration file
bgwriter_lru_maxpages        | 1000                                  | configuration file
checkpoint_completion_target | 0                                     | configuration file
checkpoint_segments          | 192                                   | configuration file
checkpoint_timeout           | 1h                                    | configuration file
client_encoding              | UTF8                                  | client
commit_delay                 | 50                                    | configuration file
commit_siblings              | 50                                    | configuration file
DateStyle                    | ISO, MDY                              | client
default_text_search_config   | pg_catalog.english                    | configuration file
dynamic_shared_memory_type   | posix                                 | configuration file
effective_cache_size         | 16GB                                  | configuration file
effective_io_concurrency     | 1                                     | configuration file
extra_float_digits           | 3                                     | session
fsync                        | on                                    | configuration file
full_page_writes             | off                                   | configuration file
hot_standby                  | on                                    | configuration file
hot_standby_feedback         | on                                    | configuration file
lc_messages                  | en_US.UTF-8                           | configuration file
lc_monetary                  | en_US.UTF-8                           | configuration file
lc_numeric                   | en_US.UTF-8                           | configuration file
lc_time                      | en_US.UTF-8                           | configuration file
listen_addresses             | *                                     | configuration file
log_destination              | stderr                                | configuration file
log_line_prefix              | %m %u                                 | configuration file
log_min_messages             | log                                   | configuration file
log_rotation_age             | 0                                     | configuration file
log_rotation_size            | 1000MB                                | configuration file
log_statement                | none                                  | configuration file
log_timezone                 | CET                                   | configuration file
log_truncate_on_rotation     | on                                    | configuration file
logging_collector            | on                                    | configuration file
maintenance_work_mem         | 2GB                                   | configuration file
max_connections              | 400                                   | configuration file
max_prepared_transactions    | 100                                   | configuration file
max_stack_depth              | 2MB                                   | environment variable
max_standby_archive_delay    | 2min                                  | configuration file
max_standby_streaming_delay  | 2min                                  | configuration file
max_wal_senders              | 3                                     | configuration file
port                         | 5432                                  | configuration file
shared_buffers               | 32GB                                  | configuration file
synchronous_commit           | off                                   | configuration file
temp_buffers                 | 8MB                                   | configuration file
TimeZone                     | Europe/Amsterdam                      | client
wal_level                    | hot_standby                           | configuration file
work_mem                     | 64MB                                  | configuration file

The program used to connect to PostgreSQL:
WildFly 8.2.0 XA datasource with driver
https://jdbc.postgresql.org/download/postgresql-9.4.1208.jar

WildFly configuration:

<xa-datasource jndi-name="java:jboss/datasources/MailInfoXADS"
               pool-name="MailInfoXADS" enabled="true" use-java-context="true">
  <xa-datasource-property name="ServerName">
    [hostname]
  </xa-datasource-property>
  <xa-datasource-property name="PortNumber">
    5432
  </xa-datasource-property>
  <xa-datasource-property name="DatabaseName">
    mailinfo_ods
  </xa-datasource-property>
  <driver>postgresql-jdbc4</driver>
  <xa-pool>
    <min-pool-size>5</min-pool-size>
    <initial-pool-size>5</initial-pool-size>
    <max-pool-size>40</max-pool-size>
    <prefill>true</prefill>
  </xa-pool>
  <security>
    <user-name>mailinfo_ow</user-name>
    <password>xxxxxx</password>
  </security>
  <validation>
    <valid-connection-checker
        class-name="org.jboss.jca.adapters.jdbc.extensions.postgres.PostgreSQLValidConnectionChecker"/>
    <exception-sorter
        class-name="org.jboss.jca.adapters.jdbc.extensions.postgres.PostgreSQLExceptionSorter"/>
  </validation>
</xa-datasource>

<drivers>
  <driver name="postgresql-jdbc4" module="org.postgresql">
    <xa-datasource-class>org.postgresql.xa.PGXADataSource</xa-datasource-class>
  </driver>
</drivers>
