Postgres - BDR issue

From: Rahul Goel <er(dot)rahulgoel(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Postgres - BDR issue
Date: 2015-09-22 15:52:00
Message-ID: CAH8foHB3DpYV8b+2fWjek35DJfHc4SPG2ymrpxSyw7OCk_eihg@mail.gmail.com
Lists: pgsql-hackers

Hi

I am facing the following issue while setting up BDR.

I have two nodes (for simplicity, I will refer to them as node 1 and node 2).
The BDR group was created from node 1. When a new Postgres node (node 2)
joins the group, node_status in the bdr.bdr_nodes table on the new node
(node 2) shows 'r', but node_status remains 'i' on the upstream master
(node 1). I can see that a conflict occurred on the bdr.bdr_nodes table and
that node 1 is unable to update the status of node 2, but I have not been
able to find a solution.

*Node 1 (BDR group was created from this node):*
(Masked DB Name, and password)

*psql -U postgres -d xyz -c "select * from bdr.bdr_nodes;"*

     node_sysid      | node_timeline | node_dboid | node_status |   node_name   |                              node_local_dsn                               |                            node_init_from_dsn
---------------------+---------------+------------+-------------+---------------+---------------------------------------------------------------------------+---------------------------------------------------------------------------
 6197340597374984280 |             1 |      16385 | r           | 10.42.157.193 | port=5432 dbname=xyzdb host=10.42.157.193 user=postgres password=password |
 6197344706786291803 |             1 |      12156 | i           | 10.42.99.96   | port=5432 dbname=xyzdb host=10.42.99.96 user=postgres password=password   | port=5432 dbname=xyzdb host=10.42.157.193 user=postgres password=password
(2 rows)

Logs

< 2015-09-22 14:16:36.244 UTC >STATEMENT: CREATE SCHEMA public;
< 2015-09-22 14:16:42.615 UTC >LOG: registering background worker "bdr db:
xyzdb"
< 2015-09-22 14:16:42.615 UTC >LOG: starting background worker process
"bdr db: xyzdb"
< 2015-09-22 14:23:16.498 UTC >LOG: logical decoding found consistent
point at 0/879E980
< 2015-09-22 14:23:16.498 UTC >DETAIL: There are no running transactions.
< 2015-09-22 14:23:16.498 UTC >LOG: exported logical decoding snapshot:
"00000511-1" with 0 transaction IDs
< 2015-09-22 14:23:25.284 UTC >LOG: starting logical decoding for slot
"bdr_16385_6197344706786291803_1_12156__"
< 2015-09-22 14:23:25.284 UTC >DETAIL: streaming transactions committing
after 0/879E9B8, reading WAL from 0/879E980
< 2015-09-22 14:23:25.284 UTC >LOG: logical decoding found consistent
point at 0/879E980
< 2015-09-22 14:23:25.284 UTC >DETAIL: There are no running transactions.
< 2015-09-22 14:23:25.294 UTC >LOG: could not receive data from client:
Connection reset by peer
< 2015-09-22 14:23:25.294 UTC >LOG: unexpected EOF on standby connection
< 2015-09-22 14:23:26.299 UTC >LOG: registering background worker "bdr
(6197340597374984280,1,16385,)->bdr (6197344706786291803,1,"
< 2015-09-22 14:23:26.299 UTC >LOG: starting background worker process
"bdr (6197340597374984280,1,16385,)->bdr (6197344706786291803,1,"
< 2015-09-22 14:23:26.311 UTC >LOG: starting logical decoding for slot
"bdr_16385_6197344706786291803_1_12156__"
< 2015-09-22 14:23:26.311 UTC >DETAIL: streaming transactions committing
after 0/87B0998, reading WAL from 0/879E9B8
< 2015-09-22 14:23:26.313 UTC >LOG: logical decoding found consistent
point at 0/879E9B8
< 2015-09-22 14:23:26.313 UTC >DETAIL: Logical decoding will begin using
saved snapshot.
< 2015-09-22 14:23:26.539 UTC >LOG: CONFLICT: remote UPDATE on relation
bdr.bdr_nodes originating at node 6197344706786291803:1:12156 at ts
2015-09-22 14:23:21.776464+00; row was previously updated at node 0:0.
Resolution: last_update_wins_keep_local; PKEY:
node_sysid[text]:6197344706786291803 node_timeline[oid]:1
node_dboid[oid]:12156 node_status[char]:i node_name[text]:10.42.99.96
node_local_dsn[text]:port=5432 dbname=xyzdb host=10.42.99.96
user=postgres password=password
node_init_from_dsn[text]:port=5432 dbname=xyzdb host=10.42.157.193
user=postgres password=password
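
The CONFLICT entry at the end of the node 1 log is the line that seems to matter here. As a reading aid only, here is a minimal Python sketch that pulls the main fields out of that entry. The regular expression and the `parse_conflict` name are illustrative assumptions written against this one log line; they are not a documented BDR log format.

```python
import re

# Pattern written against the single CONFLICT entry in the node 1 log.
# It is an assumption about the layout, not a documented BDR format.
CONFLICT_RE = re.compile(
    r"CONFLICT: remote (?P<op>\w+) on relation (?P<relation>\S+) "
    r"originating at node (?P<origin>\S+) at ts (?P<ts>\S+ \S+); "
    r"row was previously updated at node (?P<local_origin>\S+)\. "
    r"Resolution: (?P<resolution>\w+);"
)


def parse_conflict(line):
    """Return the named fields of a conflict entry, or None if it does not match."""
    match = CONFLICT_RE.search(line)
    return match.groupdict() if match else None


# The entry from the node 1 log, joined back into one line (PKEY details trimmed).
line = (
    "LOG: CONFLICT: remote UPDATE on relation bdr.bdr_nodes originating at "
    "node 6197344706786291803:1:12156 at ts 2015-09-22 14:23:21.776464+00; "
    "row was previously updated at node 0:0. Resolution: "
    "last_update_wins_keep_local; PKEY: node_sysid[text]:6197344706786291803"
)

fields = parse_conflict(line)
print(fields["relation"], fields["resolution"])
# bdr.bdr_nodes last_update_wins_keep_local
```

Going by its name, the resolution `last_update_wins_keep_local` suggests that node 1 kept its local row (status 'i') and discarded the UPDATE arriving from node 2, which would match the stale status seen on node 1.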

*Node 2 (the status of node 2 is ready, 'r', here, but on node 1 it is
still initializing, 'i'):*
(Masked DB Name, and password)

[root@3c8668f9183c /]# psql -U postgres -d xyzdb -c "select * from bdr.bdr_nodes;"

     node_sysid      | node_timeline | node_dboid | node_status |   node_name   |                              node_local_dsn                               |                            node_init_from_dsn
---------------------+---------------+------------+-------------+---------------+---------------------------------------------------------------------------+---------------------------------------------------------------------------
 6197340597374984280 |             1 |      16385 | r           | 10.42.157.193 | port=5432 dbname=xyzdb host=10.42.157.193 user=postgres password=password |
 6197344706786291803 |             1 |      12156 | r           | 10.42.99.96   | port=5432 dbname=xyzdb host=10.42.99.96 user=postgres password=password   | port=5432 dbname=xyzdb host=10.42.157.193 user=postgres password=password
(2 rows)
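
To make the disagreement between the two nodes explicit, here is a minimal Python sketch that compares the node_status values each node reports. The values are copied from the two bdr.bdr_nodes outputs in this message; the `status_mismatches` helper is a hypothetical name used for illustration and is not part of BDR.

```python
# node_status per node_sysid, copied from the bdr.bdr_nodes output of each node.
node1_view = {"6197340597374984280": "r", "6197344706786291803": "i"}
node2_view = {"6197340597374984280": "r", "6197344706786291803": "r"}


def status_mismatches(view_a, view_b):
    """Return {node_sysid: (status_a, status_b)} for nodes the two views disagree on."""
    return {
        sysid: (view_a.get(sysid), view_b.get(sysid))
        for sysid in sorted(set(view_a) | set(view_b))
        if view_a.get(sysid) != view_b.get(sysid)
    }


print(status_mismatches(node1_view, node2_view))
# {'6197344706786291803': ('i', 'r')}
```

Only the row for node 2 (6197344706786291803) differs: node 1 still records it as 'i' while node 2 records itself as 'r'.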

Logs
< 2015-09-22 14:23:11.824 UTC >LOG: registering background worker "bdr db:
xyzdb"
< 2015-09-22 14:23:11.824 UTC >LOG: starting background worker process
"bdr db: xyzdb"
< 2015-09-22 14:23:11.875 UTC >LOG: Creating replica with:
/usr/pgsql-9.4/bin/bdr_initial_load --snapshot 00000511-1 --source
"port=5432 dbname=xyzdb host=10.42.157.193 user=postgres password=password"
--target "port=5432 dbname=xyzdb host=10.42.99.96 user=postgres
password=password" --tmp-directory "/tmp/postgres-bdr-00000511-1.259",
--pg-dump-path "/usr/pgsql-9.4/bin/pg_dump", --pg-restore-path
"/usr/pgsql-9.4/bin/pg_restore"
Dumping remote database "port=5432 dbname=xyzdb host=10.42.157.193
user=postgres password=password fallback_application_name='bdr
(6197344706786291803,1,12156,): init_replica dump'" with 1 concurrent
workers to "/tmp/postgres-bdr-00000511-1.259"
Restoring dump to local DB "port=5432 dbname=xyzdb host=10.42.99.96
user=postgres password=password fallback_application_name='bdr
(6197344706786291803,1,12156,): init_replica restore' options='-c
bdr.do_not_replicate=on -c bdr.permit_unsafe_ddl_commands=on -c
bdr.skip_ddl_replication=on -c bdr.skip_ddl_locking=on'" with 1 concurrent
workers from "/tmp/postgres-bdr-00000511-1.259"
< 2015-09-22 14:23:20.632 UTC >LOG: registering background worker "bdr:
catchup apply to 0/87B0DE0"
< 2015-09-22 14:23:20.632 UTC >LOG: starting background worker process
"bdr: catchup apply to 0/87B0DE0"
< 2015-09-22 14:23:20.653 UTC >LOG: bdr apply finished processing;
replayed to 0/87B0DE0 of required 0/87B0DE0
< 2015-09-22 14:23:20.654 UTC >LOG: worker process: bdr: catchup apply to
0/87B0DE0 (PID 275) exited with exit code 0
< 2015-09-22 14:23:20.654 UTC >LOG: unregistering background worker "bdr:
catchup apply to 0/87B0DE0"
< 2015-09-22 14:23:21.655 UTC >LOG: registering background worker "bdr
(6197344706786291803,1,12156,)->bdr (6197340597374984280,1,"
< 2015-09-22 14:23:21.655 UTC >LOG: starting background worker process
"bdr (6197344706786291803,1,12156,)->bdr (6197340597374984280,1,"
< 2015-09-22 14:23:21.684 UTC >LOG: logical decoding found consistent
point at 0/86E0910
< 2015-09-22 14:23:21.684 UTC >DETAIL: There are no running transactions.
< 2015-09-22 14:23:21.685 UTC >LOG: exported logical decoding snapshot:
"0000055B-1" with 0 transaction IDs
< 2015-09-22 14:23:21.691 UTC >LOG: starting logical decoding for slot
"bdr_12156_6197340597374984280_1_16385__"
< 2015-09-22 14:23:21.691 UTC >DETAIL: streaming transactions committing
after 0/86E0948, reading WAL from 0/86E0910
< 2015-09-22 14:23:21.691 UTC >LOG: logical decoding found consistent
point at 0/86E0910
< 2015-09-22 14:23:21.691 UTC >DETAIL: There are no running transactions.

Thanks in advance for the help!

Regards
Rahul Goel
er(dot)rahulgoel(at)gmail(dot)com
647 949 1679
