BUG #15114: logical decoding Segmentation fault

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: pet(dot)slavov(at)gmail(dot)com
Subject: BUG #15114: logical decoding Segmentation fault
Date: 2018-03-15 09:24:55
Message-ID: 152110589574.1223.17983600132321618383@wrigleys.postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 15114
Logged by: Peter Slavov
Email address: pet(dot)slavov(at)gmail(dot)com
PostgreSQL version: 10.3
Operating system: Ubuntu xenial
Description:

Hi,
I get a segmentation fault when trying to start logical decoding on
the latest Postgres version, 10.3 (package version 10.3-1.pgdg16.04+1).
Here is what happens:

--- Query on master db server ---
create publication big_tables FOR TABLE table1, table2, table3, table4;

--- Query on the logical replica db server ---
create subscription sub_name CONNECTION 'host=master.server dbname=db_name
user=db_user password=db_password' PUBLICATION big_tables ;

--- Core dump: ---
------------------
Reading symbols from /usr/lib/postgresql/10/bin/postgres...Reading symbols
from
/usr/lib/debug/.build-id/a3/2aff963eb198ec48d1946523ef49379fa49c5e.debug...done.
done.
[New LWP 26112]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by 'postgres: 10/main: wal sender process lg_replica
ip-10-4-1-11.eu-west-1.compute'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 GetActiveSnapshot () at
/build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/utils/time/snapmgr.c:843
843 /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/utils/time/snapmgr.c:
No such file or directory.
(gdb)

--- Logs ---
------------
# Master db server:
2018-03-14 23:30:35 UTC [23352]: [7-1] user=,db= host=,app=[] LOG: server
process (PID 26112) was terminated by signal 11: Segmentation fault
2018-03-14 23:30:35 UTC [23352]: [8-1] user=,db= host=,app=[] LOG:
terminating any other active server processes

# Logical replica db server:
2018-03-14 23:29:22 UTC [6777]: [3-1] user=postgres,db=db_replica_name
host=[local],app=[psql] LOG: duration: 73.901 ms statement: create
subscription sub_name CONNECTION 'host=master.server dbname=db_name
user=db_user password=db_password' PUBLICATION big_tables ;
2018-03-14 23:29:22 UTC [6779]: [1-1] user=,db= host=,app=[] LOG: logical
replication apply worker for subscription "sub_name" has started
2018-03-14 23:29:22 UTC [6780]: [1-1] user=,db= host=,app=[] LOG: logical
replication table synchronization worker for subscription "sub_name", table
"table1" has started
2018-03-14 23:29:22 UTC [6781]: [1-1] user=,db= host=,app=[] LOG: logical
replication table synchronization worker for subscription "sub_name", table
"table2" has started
2018-03-14 23:29:23 UTC [6780]: [2-1] user=,db= host=,app=[] LOG: logical
replication table synchronization worker for subscription "sub_name", table
"table1" has finished
2018-03-14 23:29:23 UTC [6782]: [1-1] user=,db= host=,app=[] LOG: logical
replication table synchronization worker for subscription "sub_name", table
"table3" has started
2018-03-14 23:29:24 UTC [6781]: [2-1] user=,db= host=,app=[] LOG: logical
replication table synchronization worker for subscription "sub_name", table
"table2" has finished
2018-03-14 23:29:24 UTC [6783]: [1-1] user=,db= host=,app=[] LOG: logical
replication table synchronization worker for subscription "sub_name", table
"table4" has started
2018-03-14 23:29:26 UTC [6783]: [2-1] user=,db= host=,app=[] LOG: logical
replication table synchronization worker for subscription "sub_name", table
"table4" has finished
2018-03-14 23:29:26 UTC [6782]: [2-1] user=,db= host=,app=[] LOG: logical
replication table synchronization worker for subscription "sub_name", table
"table3" has finished
2018-03-14 23:30:35 UTC [6779]: [2-1] user=,db= host=,app=[] ERROR: could
not receive data from WAL stream: SSL SYSCALL error: EOF detected
2018-03-14 23:30:35 UTC [6739]: [7-1] user=,db= host=,app=[] LOG: worker
process: logical replication worker for subscription 17178 (PID 6779) exited
with exit code 1
2018-03-14 23:30:35 UTC [6798]: [1-1] user=,db= host=,app=[] LOG: logical
replication apply worker for subscription "sub_name" has started
2018-03-14 23:30:35 UTC [6798]: [2-1] user=,db= host=,app=[] ERROR: could
not connect to the publisher: FATAL: the database system is in recovery
mode
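
To tie the master log to the core dump, the crashed PID can be pulled out of the postmaster's "terminated by signal" line. The following is only a minimal sketch: it assumes each log entry has been re-joined onto one line (the entries above are wrapped by the mail client) and that the prefix looks exactly like the lines shown here. The names `parse_log_line` and `crashed_backend` are hypothetical helpers, not part of any PostgreSQL tooling.

```python
import re
from datetime import datetime

# Matches the prefix seen in the logs above:
# "<timestamp> UTC [<pid>]: [<n>-1] user=<u>,db=<d> host=<h>,app=[<a>] <LEVEL>: <message>"
LOG_LINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC "
    r"\[(?P<pid>\d+)\]: \[(?P<line>\d+)-1\] "
    r"user=(?P<user>[^,]*),db=(?P<db>\S*) "
    r"host=(?P<host>[^,]*),app=\[(?P<app>[^\]]*)\] "
    r"(?P<level>[A-Z]+): +(?P<message>.*)$"
)

# Message the postmaster logs when a child process dies on a signal.
CRASH_MSG = re.compile(
    r"server process \(PID (?P<pid>\d+)\) was terminated by signal (?P<signal>\d+)"
)


def parse_log_line(line):
    """Split one unwrapped log line into its fields, or return None."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["time"] = datetime.strptime(rec["time"], "%Y-%m-%d %H:%M:%S")
    rec["pid"] = int(rec["pid"])
    return rec


def crashed_backend(line):
    """Return the crashed PID and signal if the line reports a crash."""
    rec = parse_log_line(line)
    if rec is None:
        return None
    m = CRASH_MSG.search(rec["message"])
    if m is None:
        return None
    return {
        "time": rec["time"],
        "postmaster_pid": rec["pid"],
        "crashed_pid": int(m["pid"]),
        "signal": int(m["signal"]),
    }


# The first master log entry from above, re-joined onto a single line.
sample = (
    "2018-03-14 23:30:35 UTC [23352]: [7-1] user=,db= host=,app=[] LOG: "
    "server process (PID 26112) was terminated by signal 11: Segmentation fault"
)
print(crashed_backend(sample))
```

For the entry above this yields PID 26112 and signal 11, the same process that appears as "[New LWP 26112]" in the gdb output.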

--- Configurations ---
----------------------
Default Debian postgresql.conf with these changes:

listen_addresses = '*'

max_connections = 500

shared_buffers = 1GB
work_mem = 4MB
maintenance_work_mem = 128MB
effective_cache_size = 4GB
effective_io_concurrency = 5

wal_level = logical
max_wal_senders = 10
max_replication_slots = 10
max_worker_processes = 20
wal_log_hints = off
hot_standby = on

shared_preload_libraries = 'pg_stat_statements'

tcp_keepalives_idle = 300
tcp_keepalives_interval = 10
tcp_keepalives_count = 6

log_min_duration_statement = 5
log_lock_waits = on
log_checkpoints = on

log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
log_filename = '%A.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d host=%h,app=[%a] '
log_timezone = 'UTC'
log_statement = 'none'
log_duration = off
log_min_duration_statement = 0
log_hostname = on
log_checkpoints = on
log_connections = on
log_disconnections = on

wal_compression = on

ssl_cert_file = '/etc/postgresql/ssl/database_public.crt'
ssl_key_file = '/etc/postgresql/ssl/database_private.key'
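
Some parameters appear twice in the list above (log_min_duration_statement and log_checkpoints). When postgresql.conf contains several entries for the same parameter, only the last one takes effect. A rough sketch for spotting such repeats follows; `find_duplicate_settings` is a hypothetical helper, it assumes plain `name = value` lines, and its comment stripping is naive (a `#` inside a quoted value would be cut off).

```python
from collections import defaultdict


def find_duplicate_settings(conf_text):
    """Return {parameter: [values in file order]} for repeated parameters."""
    seen = defaultdict(list)
    for raw in conf_text.splitlines():
        # Naive comment stripping: assumes no '#' inside quoted values.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # Parameter names are case-insensitive.
        seen[name.strip().lower()].append(value.strip())
    return {name: values for name, values in seen.items() if len(values) > 1}


# Excerpt of the settings listed above, in the same order.
conf = """
log_min_duration_statement = 5
log_lock_waits = on
log_checkpoints = on
log_statement = 'none'
log_min_duration_statement = 0
log_checkpoints = on
"""

for name, values in find_duplicate_settings(conf).items():
    # The last value in file order is the one the server uses.
    print(f"{name}: set {len(values)} times, effective value {values[-1]}")
```

With the settings above, the effective log_min_duration_statement is 0 rather than 5, which is why every statement duration shows up in the replica log.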

Does this sound familiar? What am I doing wrong?
