Re: Logical Replication WIP

From: Erik Rijkers <er(at)xs4all(dot)nl>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Logical Replication WIP
Date: 2016-09-07 12:10:58
Message-ID: 77181df0acef81e1564bc6591daaf8a8@xs4all.nl
Lists: pgsql-hackers

On 2016-08-31 22:51, Petr Jelinek wrote:
>
> and one more version with bug fixes, improved code docs and couple

I am not able to get the replication to work. Would you (or anyone) be
so kind as to point out what I am doing wrong?

Patches applied, compiled, make-checked, installed OK.

I have 2 copies compiled and installed, logical_replication and
logical_replication2, to be publisher and subscriber, ports 6972 and
6973 respectively.

( BTW, there is no postgres user; the OS user is 'aardvark'. 'aardvark'
is also the db superuser, and it is also the user under which the two
installations are installed. )

PGPASSFILE is set up and works for both instances.

Both pg_hba.conf files were changed to include:
local replication aardvark md5

instances.sh
--------------------------------------------------------------------
#!/bin/sh
project1=logical_replication # publisher
project2=logical_replication2 # subscriber
pg_stuff_dir=$HOME/pg_stuff
PATH1=$pg_stuff_dir/pg_installations/pgsql.$project1/bin:$PATH
PATH2=$pg_stuff_dir/pg_installations/pgsql.$project2/bin:$PATH
server_dir1=$pg_stuff_dir/pg_installations/pgsql.$project1
server_dir2=$pg_stuff_dir/pg_installations/pgsql.$project2
port1=6972
port2=6973
data_dir1=$server_dir1/data
data_dir2=$server_dir2/data
options1="
-c wal_level=logical
-c max_replication_slots=10
-c max_worker_processes=12
-c max_logical_replication_workers=10
-c max_wal_senders=10
-c logging_collector=on
-c log_directory=$server_dir1
-c log_filename=logfile.${project1} "

options2="
-c wal_level=logical
-c max_replication_slots=10
-c max_worker_processes=12
-c max_logical_replication_workers=10
-c max_wal_senders=10
-c logging_collector=on
-c log_directory=$server_dir2
-c log_filename=logfile.${project2} "

# start two instances:
export PATH=$PATH1; postgres -D $data_dir1 -p $port1 ${options1} &

export PATH=$PATH2; postgres -D $data_dir2 -p $port2 ${options2} &
--------------------------------------------------------------------

Both instances run fine.

On publisher db:
Create a table testt, with 20 rows.

CREATE PUBLICATION pub1 FOR TABLE testt ;
No problem.
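
A minimal sketch of the publisher-side table setup, for anyone trying to
reproduce this. The column layout (id, val) and the function name
testt_setup_sql are assumptions made for illustration; the message above
only states that testt has 20 rows. The port and database name are taken
from the connection string used for the subscription.

```shell
#!/bin/sh
# Print SQL that creates testt and fills it with 20 rows.
# ASSUMPTION: the columns (id, val) are hypothetical; the original
# message does not give the table definition.
testt_setup_sql() {
    cat <<'EOF'
CREATE TABLE testt (id integer PRIMARY KEY, val text);
INSERT INTO testt SELECT i, 'row ' || i FROM generate_series(1, 20) AS g(i);
EOF
}

# Example (publisher on port 6972, socket in /tmp, database testdb):
#   testt_setup_sql | psql -h /tmp -p 6972 -d testdb
```

The function only prints the statements, so they can be inspected before
being piped to psql.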

On Subscriber db:
CREATE SUBSCRIPTION sub1
  WITH CONNECTION 'host=/tmp dbname=testdb port=6972'
  PUBLICATION pub1 INITIALLY DISABLED;
ALTER SUBSCRIPTION sub1 ENABLE;

Adding rows to the table (publisher-side) gets activity going. The
resulting logs from both sides follow:

Logfile publisher side:
[...]
2016-09-07 13:47:44.287 CEST 21995 LOG: logical replication launcher started
2016-09-07 13:51:42.601 CEST 22141 LOG: logical decoding found consistent point at 0/230F478
2016-09-07 13:51:42.601 CEST 22141 DETAIL: There are no running transactions.
2016-09-07 13:51:42.601 CEST 22141 LOG: exported logical decoding snapshot: "00000702-1" with 0 transaction IDs
2016-09-07 13:52:11.326 CEST 22144 LOG: starting logical decoding for slot "sub1"
2016-09-07 13:52:11.326 CEST 22144 DETAIL: streaming transactions committing after 0/230F4B0, reading WAL from 0/230F478
2016-09-07 13:52:11.326 CEST 22144 LOG: logical decoding found consistent point at 0/230F478
2016-09-07 13:52:11.326 CEST 22144 DETAIL: There are no running transactions.
2016-09-07 13:53:47.012 CEST 22144 LOG: could not receive data from client: Connection reset by peer
2016-09-07 13:53:47.012 CEST 22144 LOG: unexpected EOF on standby connection
2016-09-07 13:53:47.025 CEST 22185 LOG: starting logical decoding for slot "sub1"
2016-09-07 13:53:47.025 CEST 22185 DETAIL: streaming transactions committing after 0/230F628, reading WAL from 0/230F5F0
2016-09-07 13:53:47.025 CEST 22185 LOG: logical decoding found consistent point at 0/230F5F0
2016-09-07 13:53:47.025 CEST 22185 DETAIL: There are no running transactions.
2016-09-07 13:53:47.030 CEST 22185 LOG: could not receive data from client: Connection reset by peer
2016-09-07 13:53:47.030 CEST 22185 LOG: unexpected EOF on standby connection
2016-09-07 13:53:52.044 CEST 22188 LOG: starting logical decoding for slot "sub1"
2016-09-07 13:53:52.044 CEST 22188 DETAIL: streaming transactions committing after 0/230F628, reading WAL from 0/230F5F0
2016-09-07 13:53:52.044 CEST 22188 LOG: logical decoding found consistent point at 0/230F5F0
2016-09-07 13:53:52.044 CEST 22188 DETAIL: There are no running transactions.
2016-09-07 13:53:52.195 CEST 22188 LOG: could not receive data from client: Connection reset by peer
2016-09-07 13:53:52.195 CEST 22188 LOG: unexpected EOF on standby connection
(this repeats every few seconds)

Logfile subscriber-side:
[...]
2016-09-07 13:47:44.441 CEST 21997 LOG: MultiXact member wraparound protections are now enabled
2016-09-07 13:47:44.528 CEST 21986 LOG: database system is ready to accept connections
2016-09-07 13:47:44.529 CEST 22002 LOG: logical replication launcher started
2016-09-07 13:52:11.319 CEST 22143 LOG: logical replication apply for subscription sub1 started
2016-09-07 13:53:47.010 CEST 22143 ERROR: could not open relation with OID 0
2016-09-07 13:53:47.012 CEST 21986 LOG: worker process: logical replication worker 24048 (PID 22143) exited with exit code 1
2016-09-07 13:53:47.018 CEST 22184 LOG: logical replication apply for subscription sub1 started
2016-09-07 13:53:47.028 CEST 22184 ERROR: could not open relation with OID 0
2016-09-07 13:53:47.030 CEST 21986 LOG: worker process: logical replication worker 24048 (PID 22184) exited with exit code 1
2016-09-07 13:53:52.041 CEST 22187 LOG: logical replication apply for subscription sub1 started
2016-09-07 13:53:52.045 CEST 22187 ERROR: could not open relation with OID 0
2016-09-07 13:53:52.046 CEST 21986 LOG: worker process: logical replication worker 24048 (PID 22187) exited with exit code 1
(repeat every few seconds)
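
The restart loop in the subscriber log (apply worker starts, hits the
relation error, exits, gets relaunched) can be counted with a small
helper. This is only a sketch: the function name summarize_apply_failures
is made up for illustration, and the match strings are copied from the
log messages quoted above.

```shell
#!/bin/sh
# Summarize the apply-worker restart loop from a subscriber log file.
# Usage: summarize_apply_failures /path/to/logfile.logical_replication2
# (hypothetical helper; patterns are taken from the log excerpt above)
summarize_apply_failures() {
    logfile=$1
    # grep -c prints the number of matching lines; "|| true" keeps the
    # non-zero exit status on zero matches from aborting under "set -e".
    starts=$(grep -c 'logical replication apply for subscription' "$logfile" || true)
    errors=$(grep -c 'could not open relation with OID 0' "$logfile" || true)
    exits=$(grep -c 'exited with exit code 1' "$logfile" || true)
    echo "apply starts: $starts, relation errors: $errors, worker exits: $exits"
}
```

With the instances.sh settings, the subscriber log would be
$server_dir2/logfile.logical_replication2.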

Any hints welcome.

Thanks!

Erik Rijkers
