Re: Problem with pcp_recovery_node

From: Koshino Taiki <koshino(at)sraoss(dot)co(dot)jp>
To: "puska(at)ztm(dot)hr" <puska(at)ztm(dot)hr>
Cc: "pgpool-general(at)lists(dot)postgresql(dot)org" <pgpool-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Problem with pcp_recovery_node
Date: 2025-10-29 06:23:49
Message-ID: TYYP286MB29318B7424FA42665B37D00294FAA@TYYP286MB2931.JPNP286.PROD.OUTLOOK.COM
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgpool-general

Hi,

First, version 4.3.5 is outdated and contains known vulnerabilities, so it is recommended to upgrade to the latest version.

When executing pcp_recovery_node, the error "ERROR: executing recovery, execution of command failed at '1st stage'" occurs, but since the script runs normally when executed directly, it is necessary to check whether CREATE EXTENSION pgpool_recovery has been executed.

For more details about the error, check the PostgreSQL logs.

Taiki Koshino<koshino(at)sraoss(dot)co(dot)jp>
SRA OSS K.K.
TEL: 03-5979-2701 FAX: 03-5979-2702
URL: https://www.sraoss.co.jp/

________________________________
差出人: puska <puska(at)ztm(dot)hr>
送信日時: 2025年10月24日 21:43
宛先: pgpool-general(at)lists(dot)postgresql(dot)org <pgpool-general(at)lists(dot)postgresql(dot)org>
件名: Problem with pcp_recovery_node

Hi,

I'm trying to deploy pgpool2 cluster with the help of instructions form https://www.pgpool.net/docs/pgpool-II-4.3.5/en/html/example-cluster.html of and the thing that's bugging me at the moment is that I'm not able to create new node.
When I start (as root user or postgres user)

pcp_recovery_node -h 192.168.137.150 -p 9898 -U pgpool -n 1

I get:

ERROR: executing recovery, execution of command failed at "1st stage"
DETAIL: command:"/var/lib/postgresql/15/main/recovery_1st_stage"

but when I try calling recovery_1st_stage like:

/var/lib/postgresql/15/main/recovery_1st_stage /var/lib/postgresql/15/main sql2.ztm.hr /var/lib/postgresql/15/main 5432 1 5432 sql1.ztm.hr

the script passes without errors:

recovery_1st_stage: end: recovery_1st_stage is completed successfully

In pgpool2 log I get:

2025-10-24 14:25:03.657: pcp_main pid 328374: DEBUG: I am PCP child with PID:328374 and accept fd:7
2025-10-24 14:25:03.657: pcp_main pid 328374: LOG: forked new pcp worker, pid=338306 socket=7
2025-10-24 14:25:03.658: pcp_child pid 338306: DEBUG: I am PCP worker child with pid:338306
2025-10-24 14:25:03.658: pcp_child pid 338306: DEBUG: initializing backend status
2025-10-24 14:25:03.658: pcp_child pid 338306: DEBUG: received PCP packet
2025-10-24 14:25:03.658: pcp_child pid 338306: DETAIL: PCP packet type of service 'M'
2025-10-24 14:25:03.658: pcp_child pid 338306: DEBUG: PCP: sending md5 salt to client
2025-10-24 14:25:03.658: pcp_child pid 338306: DEBUG: received PCP packet
2025-10-24 14:25:03.658: pcp_child pid 338306: DETAIL: PCP packet type of service 'R'
2025-10-24 14:25:03.658: pcp_child pid 338306: DEBUG: PCP: processing authentication request
2025-10-24 14:25:03.658: pcp_child pid 338306: DETAIL: authentication OK
2025-10-24 14:25:03.658: pcp_child pid 338306: DEBUG: received PCP packet
2025-10-24 14:25:03.658: pcp_child pid 338306: DETAIL: PCP packet type of service 'O'
2025-10-24 14:25:03.658: pcp_child pid 338306: DEBUG: PCP: processing recovery request
2025-10-24 14:25:03.658: pcp_child pid 338306: DETAIL: start online recovery
2025-10-24 14:25:03.658: pcp_child pid 338306: LOG: starting recovering node 1
2025-10-24 14:25:03.677: pcp_child pid 338306: DEBUG: recovery execute checkpoint, start checkpoint
2025-10-24 14:25:03.688: pcp_child pid 338306: DEBUG: recovery execute checkpoint, finish checkpoint
2025-10-24 14:25:03.688: pcp_child pid 338306: LOG: node recovery, CHECKPOINT in the 1st stage done
2025-10-24 14:25:03.688: pcp_child pid 338306: LOG: executing recovery
2025-10-24 14:25:03.688: pcp_child pid 338306: DETAIL: starting recovery command: "SELECT pgpool_rec
overy('/var/lib/postgresql/15/main/recovery_1st_stage', 'sql2.ztm.hr', '/var/lib/postgresql/15/main',
'5432', 1, '5432', 'sql1.ztm.hr')"
2025-10-24 14:25:03.688: pcp_child pid 338306: LOG: executing recovery
2025-10-24 14:25:03.688: pcp_child pid 338306: DETAIL: disabling statement_timeout
2025-10-24 14:25:03.688: pcp_child pid 338306: DEBUG: executing recovery, start recovery
2025-10-24 14:25:03.690: pcp_child pid 338306: DEBUG: sending signal:12 to the parent process with P
ID:328328
2025-10-24 14:25:03.690: pcp_child pid 338306: ERROR: executing recovery, execution of command faile
d at "1st stage"
2025-10-24 14:25:03.690: pcp_child pid 338306: DETAIL: command:"/var/lib/postgresql/15/main/recovery_1st_stage"

I'm not sure if order of arguments from the log is correct, if I run recovery_1st_stage with argument ordered like in the log I'm getting error about host /var/lib/postgresql/15/main not being available.
Is there a way to get stderr from recovery_1st_stage while running it from pcp_recovery_node?

I installed pgpool2 version 4.3.5 from Debian 12 package, PostgreSQL is version 15.

pgpool2 config file attached

--

[Croatian Institute of Emergency Medicine]<http://www.hzhm.hr/>
Željko Puškarić / Voditelj informatičke službe | Head of ICT Sector
zpuskaric(at)hzhm(dot)hr<mailto:zpuskaric(at)hzhm(dot)hr> / +385 1 5496-092
Hrvatski zavod za hitnu medicinu | Croatian Institute of Emergency Medicine
Office: +385 1 5496-089 / Fax: +385 1 4677-180
Planinska 13, 10000 Zagreb, Hrvatska | Croatia
www.hzhm.hr<http://www.hzhm.hr/>

In response to

Browse pgpool-general by date

  From Date Subject
Next Message Luca Ferrari 2025-10-29 08:22:39 Re: what is causing different pcp_node_info status
Previous Message Tatsuo Ishii 2025-10-28 23:39:54 Re: [pgpool-general: 9486] Retiring slony mode