what is causing different pcp_node_info status

From: Luca Ferrari <fluca1978(at)gmail(dot)com>
To: pgpool-general(at)lists(dot)postgresql(dot)org
Subject: what is causing different pcp_node_info status
Date: 2025-10-17 10:45:16
Message-ID: CAKoxK+6mkk5r3WREDUduu-ZQH0S87x=cddYDMDyvYyfTzDxj1w@mail.gmail.com
Lists: pgpool-general

Hi all,
I'm using PgPool 4.6.2, PostgreSQL 17.5, three nodes:
- pg1 primary
- pg2 replica
- pg3 replica

So far so good, the initial situation of pcp_node_info reported by all
the nodes seems consistent:

% ssh pg1 'sudo -u postgres pcp_node_info -U pgpool'
pg1 5432 2 0.166667 up up primary primary 0 none none 2025-10-16 11:49:45
pg2 5432 2 0.333333 up up standby standby 0 streaming async 2025-10-16 12:02:10
pg3 5432 2 0.500000 up up standby standby 0 streaming async 2025-10-16 11:49:53

% ssh pg2 'sudo -u postgres pcp_node_info -U pgpool'
pg1 5432 1 0.166667 waiting up primary primary 0 none none 2025-10-16 11:51:36
pg2 5432 1 0.333333 waiting up standby standby 0 streaming async 2025-10-16 12:02:09
pg3 5432 1 0.500000 waiting up standby standby 0 streaming async 2025-10-16 11:51:36

% ssh pg3 'sudo -u postgres pcp_node_info -U pgpool'
pg1 5432 1 0.166667 waiting up primary primary 0 none none 2025-10-16 11:41:22
pg2 5432 1 0.333333 waiting up standby standby 0 streaming async 2025-10-16 12:02:09
pg3 5432 1 0.500000 waiting up standby standby 0 streaming async 2025-10-16 11:49:53

The replicas are streaming. I then rebooted pg3 (at 17:30), and the
situation changed:

% ssh pg3 'sudo reboot'

From now on, pgpool sees pg3 as down, even though the replica is
streaming correctly:

% ssh pg1 'sudo -u postgres pcp_node_info -U pgpool'
pg1 5432 2 0.166667 up up primary primary 0 none none 2025-10-16 11:49:45
pg2 5432 2 0.333333 up up standby standby 0 streaming async 2025-10-16 12:02:10
pg3 5432 2 0.500000 up up standby standby 0 streaming async 2025-10-16 17:32:45

% ssh pg2 'sudo -u postgres pcp_node_info -U pgpool'
pg1 5432 1 0.166667 waiting up primary primary 0 none none 2025-10-16 11:51:36
pg2 5432 1 0.333333 waiting up standby standby 0 streaming async 2025-10-16 12:02:09
pg3 5432 3 0.500000 down up standby standby 1078 streaming async 2025-10-16 17:32:45

% ssh pg3 'sudo -u postgres pcp_node_info -U pgpool'
pg1 5432 1 0.166667 waiting up primary primary 0 none none 2025-10-16 17:32:44
pg2 5432 1 0.333333 waiting up standby standby 0 streaming async 2025-10-16 17:32:44
pg3 5432 3 0.500000 down up standby standby 896 streaming async 2025-10-16 17:32:45

The problem is that the status of pg3 is not consistent: pg1 reports
it as "up up" while the other nodes report it as "down up".
Please note that the update timestamps are in sync (17:32:45), so the
update seems to have been propagated correctly within pgpool.
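
One way to check the three reports against each other mechanically is
sketched below. It is an illustration only: it assumes the non-verbose
pcp_node_info layout, in which the fifth whitespace-separated field is
the status name as pgpool sees it, and it treats "waiting" and "up" as
equivalent (both mean the backend is attached; they differ only in
whether connections are pooled yet). The function names are
hypothetical, and the sample data is the post-reboot output shown above.

```python
# Statuses that both mean "backend is attached" (assumption stated above).
ATTACHED = {"up", "waiting"}


def parse_node_info(text):
    """Map backend hostname -> pgpool status name (fifth field)."""
    nodes = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank lines and wrapped timestamp fragments
        nodes[fields[0]] = fields[4]
    return nodes


def find_disagreements(reports):
    """Return backends whose attached/down state differs between reporters.

    reports maps the reporting pgpool host to its raw pcp_node_info output.
    """
    views = {}
    for reporter, text in reports.items():
        for backend, status in parse_node_info(text).items():
            views.setdefault(backend, {})[reporter] = status
    return {
        backend: seen
        for backend, seen in views.items()
        if len({status in ATTACHED for status in seen.values()}) > 1
    }


# Post-reboot output copied from the three pgpool hosts.
REPORTS = {
    "pg1": """
pg1 5432 2 0.166667 up up primary primary 0 none none 2025-10-16 11:49:45
pg2 5432 2 0.333333 up up standby standby 0 streaming async 2025-10-16 12:02:10
pg3 5432 2 0.500000 up up standby standby 0 streaming async 2025-10-16 17:32:45
""",
    "pg2": """
pg1 5432 1 0.166667 waiting up primary primary 0 none none 2025-10-16 11:51:36
pg2 5432 1 0.333333 waiting up standby standby 0 streaming async 2025-10-16 12:02:09
pg3 5432 3 0.500000 down up standby standby 1078 streaming async 2025-10-16 17:32:45
""",
    "pg3": """
pg1 5432 1 0.166667 waiting up primary primary 0 none none 2025-10-16 17:32:44
pg2 5432 1 0.333333 waiting up standby standby 0 streaming async 2025-10-16 17:32:44
pg3 5432 3 0.500000 down up standby standby 896 streaming async 2025-10-16 17:32:45
""",
}

if __name__ == "__main__":
    for backend, seen in find_disagreements(REPORTS).items():
        print(backend, seen)
```

Run against the data above, only pg3 is flagged: pg1 reports it as up
while pg2 and pg3 report it as down. The "up" versus "waiting"
difference on pg1 and pg2 is not flagged, since under the stated
assumption it is not a disagreement about whether the backend is
attached.
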
My questions are:
- is it safe right now to execute pcp_attach_node?
- why is the status not consistent? Is there anything I should look for in the logs?

Thanks,
Luca
