Re: Logical Replication WIP - FailedAssertion, File: "array_typanalyze.c", Line: 340

From: Erik Rijkers <er(at)xs4all(dot)nl>
To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Steve Singer <steve(at)ssinger(dot)info>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Logical Replication WIP - FailedAssertion, File: "array_typanalyze.c", Line: 340
Date: 2017-01-19 17:44:26
Message-ID: 14a31c5408a67d19770d1ba0a6086c01@xs4all.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017-01-19 01:02, Petr Jelinek wrote:

This causes the replica to crash:

#--------------
#!/bin/bash

# 2 instances on 6972 (master) and 6973 (replica)
# initially without publication or subscription

# clean logs
#echo >
/var/data1/pg_stuff/pg_installations/pgsql.logical_replication/logfile.logical_replication
#echo >
/var/data1/pg_stuff/pg_installations/pgsql.logical_replication2/logfile.logical_replication2

SLEEP=1
bail=0
pub_count=$( echo "select count(*) from pg_publication" | psql -qtAXp
6972 )
if [[ $pub_count -ne 0 ]]
then
echo "pub_count -ne 0 - deleting pub1 & bailing out"
echo "drop publication if exists pub1" | psql -Xp 6972
bail=1
fi
sub_count=$( echo "select count(*) from pg_subscription" | psql -qtAXp
6973 )
if [[ $sub_count -ne 0 ]]
then
echo "sub_count -ne 0 - deleting sub1 & bailing out"
echo "drop subscription if exists sub1" | psql -Xp 6973
bail=1
fi

if [[ $bail -eq 1 ]]
then
exit -1
fi

echo "drop table if exists testt;" | psql -qXap 6972
echo "drop table if exists testt;" | psql -qXap 6973

echo "-- on master (port 6972):
create table testt(id serial primary key, n integer, c text);
create publication pub1 for all tables; " | psql -qXap 6972

echo "-- on replica (port 6973):
create table testt(id serial primary key, n integer, c text);
create subscription sub1 connection 'port=6972' publication pub1 with
(disabled);
alter subscription sub1 enable; " | psql -qXap 6973

sleep $SLEEP

echo "table testt /*limit 3*/; select current_setting('port'), count(*)
from testt;" | psql -qXp 6972
echo "table testt /*limit 3*/; select current_setting('port'), count(*)
from testt;" | psql -qXp 6973

echo "-- now crash:
analyze pg_subscription" | psql -qXp 6973
#--------------

-- log of the replica:

2017-01-19 17:54:09.163 CET 224200 LOG: starting logical replication
worker for subscription "sub1"
2017-01-19 17:54:09.166 CET 21166 LOG: logical replication apply for
subscription sub1 started
2017-01-19 17:54:09.169 CET 21166 LOG: starting logical replication
worker for subscription "sub1"
2017-01-19 17:54:09.172 CET 21171 LOG: logical replication sync for
subscription sub1, table testt started
2017-01-19 17:54:09.190 CET 21171 LOG: logical replication
synchronization worker finished processing
TRAP: FailedAssertion("!(((array)->elemtype) == extra_data->type_id)",
File: "array_typanalyze.c", Line: 340)
2017-01-19 17:54:20.110 CET 224190 LOG: server process (PID 21183) was
terminated by signal 6: Aborted
2017-01-19 17:54:20.110 CET 224190 DETAIL: Failed process was running:
autovacuum: ANALYZE pg_catalog.pg_subscription
2017-01-19 17:54:20.110 CET 224190 LOG: terminating any other active
server processes
2017-01-19 17:54:20.110 CET 224198 WARNING: terminating connection
because of crash of another server process
2017-01-19 17:54:20.110 CET 224198 DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit,
because another server process exited abnormally and possibly corrupted
shared memory.
2017-01-19 17:54:20.110 CET 224198 HINT: In a moment you should be able
to reconnect to the database and repeat your command.
2017-01-19 17:54:20.111 CET 224190 LOG: all server processes
terminated; reinitializing
2017-01-19 17:54:20.143 CET 21184 LOG: database system was interrupted;
last known up at 2017-01-19 17:38:48 CET
2017-01-19 17:54:20.179 CET 21184 LOG: recovered replication state of
node 1 to 0/2CEBF08
2017-01-19 17:54:20.179 CET 21184 LOG: database system was not properly
shut down; automatic recovery in progress
2017-01-19 17:54:20.181 CET 21184 LOG: redo starts at 0/2513E88
2017-01-19 17:54:20.184 CET 21184 LOG: invalid record length at
0/2546980: wanted 24, got 0
2017-01-19 17:54:20.184 CET 21184 LOG: redo done at 0/2546918
2017-01-19 17:54:20.184 CET 21184 LOG: last completed transaction was
at log time 2017-01-19 17:54:09.191697+01
2017-01-19 17:54:20.191 CET 21184 LOG: MultiXact member wraparound
protections are now enabled
2017-01-19 17:54:20.193 CET 224190 LOG: database system is ready to
accept connections
2017-01-19 17:54:20.193 CET 21188 LOG: autovacuum launcher started
2017-01-19 17:54:20.194 CET 21190 LOG: logical replication launcher
started
2017-01-19 17:54:20.194 CET 21190 LOG: starting logical replication
worker for subscription "sub1"
2017-01-19 17:54:20.202 CET 21191 LOG: logical replication apply for
subscription sub1 started

Could probably be whittled down to something shorter but I hope it's
still easily reproduced.

thanks,

Erik Rijkers

setup of the 2 instances:

#---------------- ./instances.sh
#!/bin/bash
port1=6972
port2=6973
project1=logical_replication
project2=logical_replication2
# pg_stuff_dir=$HOME/pg_stuff
pg_stuff_dir=/var/data1/pg_stuff
PATH1=$pg_stuff_dir/pg_installations/pgsql.$project1/bin:$PATH
PATH2=$pg_stuff_dir/pg_installations/pgsql.$project2/bin:$PATH
server_dir1=$pg_stuff_dir/pg_installations/pgsql.$project1
server_dir2=$pg_stuff_dir/pg_installations/pgsql.$project2
data_dir1=$server_dir1/data
data_dir2=$server_dir2/data
options1="
-c wal_level=logical
-c max_replication_slots=10
-c max_worker_processes=12
-c max_logical_replication_workers=10
-c max_wal_senders=10
-c logging_collector=on
-c log_directory=$server_dir1
-c log_filename=logfile.${project1} "

options2="
-c wal_level=replica
-c max_replication_slots=10
-c max_worker_processes=12
-c max_logical_replication_workers=10
-c max_wal_senders=10
-c logging_collector=on
-c log_directory=$server_dir2
-c log_filename=logfile.${project2} "
which postgres
export PATH=$PATH1; postgres -D $data_dir1 -p $port1 ${options1} &
export PATH=$PATH2; postgres -D $data_dir2 -p $port2 ${options2} &
#---------------- ./instances.sh end

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey Borodin 2017-01-19 17:47:54 Re: pg_background contrib module proposal
Previous Message Robert Haas 2017-01-19 17:13:08 Re: [HACKERS] SEGFAULT in HEAD with replication