Re: Problem with pcp process

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: pgpool-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Problem with pcp process
Date: 2026-04-26 05:56:19
Message-ID: 20260426.145619.2101180528950972824.ishii@postgresql.org
Lists: pgpool-hackers

> Koshino told me off-list that the following script does not work:
> -------------------------------------
> pgpool_setup -n 3 --no-stop
> pg_ctl -D data2 stop
> while true
> do
>     psql -p 11000 -c "show pool_nodes" test
>     if [ $? = 0 ]; then
>         break
>     fi
>     sleep 1
> done
> psql -p 11000 -c "show pool_nodes" test
> pcp_recovery_node -p 11001 -n 2;pcp_promote_node -p 11001 -n 2 -s -g
> -------------------------------------
>
> pcp_recovery_node reports success but pcp_promote_node just hangs. I
> found pcp worker process loops infinitely around line 584 in
> pool_detach_node (pcp_worker.c):
>
> 	while (!pcp_worker_wakeup_request)
> 	{
> 		struct timeval t = {1, 0};
>
> 		select(0, NULL, NULL, NULL, &t);
> 	}
>
> pcp_worker_wakeup_request is a variable supposed to be set to 1 by
> SIGUSR2 signal handler. When pgpool main finishes failover requests
> from pcp, it sends SIGUSR2 to pcp main process, then it forwards to
> pcp worker process, and its signal handler sets the variable to 1. To
> find the process id to forward the signal, pcp main process keeps a
> list of pids of forked children (pcp worker process) in its local
> memory.
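
For readers who want to see the wakeup mechanism in isolation, here is a
minimal sketch of the flag-plus-signal-handler pattern described above. It
is not the pgpool code: the names wakeup_request, wakeup_handler,
install_wakeup_handler and wait_for_wakeup are hypothetical, and the
max_polls bound is an addition for illustration only (the loop quoted
above has no such limit, which is why it spins forever when the signal is
never forwarded).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>

/* Flag set asynchronously by the SIGUSR2 handler (hypothetical name). */
static volatile sig_atomic_t wakeup_request = 0;

/* Signal handler: only sets the flag, which is async-signal-safe. */
static void wakeup_handler(int sig)
{
    (void) sig;
    wakeup_request = 1;
}

/* Install the SIGUSR2 handler; returns 0 on success, -1 on error. */
int install_wakeup_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = wakeup_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR2, &sa, NULL);
}

/*
 * Poll once per second until the flag is set or max_polls polls have
 * elapsed. Returns 1 if woken by the signal, 0 on timeout. The bound is
 * an assumption of this sketch, not part of the loop quoted above.
 */
int wait_for_wakeup(int max_polls)
{
    int woken;

    for (int i = 0; i < max_polls && !wakeup_request; i++)
    {
        struct timeval t = {1, 0};

        /* select() with no descriptors acts as an interruptible sleep */
        select(0, NULL, NULL, NULL, &t);
    }
    woken = wakeup_request ? 1 : 0;
    wakeup_request = 0;     /* reset for the next request */
    return woken;
}
```

If nobody ever delivers SIGUSR2 to the process, wait_for_wakeup() returns 0
after the bound; without the bound the caller would wait indefinitely,
which is the hang reported here.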
>
> Upon failover, pgpool main sends a signal to pcp main process to
> request restarting, and pcp main restarts. Problem is, when pcp
> main restarts, it forgets the list of pids. As a result, when pgpool
> main sends SIGUSR2 to pcp main, it cannot find the pid to send the
> signal to, which causes the infinite loop in pcp worker process.
>
> To fix the problem, we could delay the restarting of pcp main until it
> delivers the signal. Unfortunately this does not work, since pgpool
> main waits for pcp main process to exit. Thus failover processing
> would not proceed in pgpool main.
>
> So I decided to add a new shared memory area to hold the pcp workers
> pids as an array. Upon restarting of pcp main process, it reads the
> pids from the shared memory into its local memory. When child process
> is forked, its pid is added to the shared memory array. When child
> process exits, its pid in the array is cleared to 0, representing an
> empty slot.
>
> Attached is a patch to implement it.
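
To illustrate the slot handling described above, here is a rough sketch of
a fixed-size pid array in which 0 marks an empty slot. It is not taken from
the patch: MAX_PCP_WORKERS and the worker_pid_* function names are
hypothetical, and the array is passed in as a plain pointer, whereas in
pgpool it would live in the new shared memory area.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

#define MAX_PCP_WORKERS 32      /* hypothetical capacity */

/* Record pid in the first empty slot (0). Returns the slot, or -1 if full. */
int worker_pid_add(pid_t *slots, size_t n, pid_t pid)
{
    assert(pid > 0);
    for (size_t i = 0; i < n; i++)
    {
        if (slots[i] == 0)
        {
            slots[i] = pid;
            return (int) i;
        }
    }
    return -1;
}

/* Clear the slot holding pid back to 0. Returns the slot, or -1 if absent. */
int worker_pid_remove(pid_t *slots, size_t n, pid_t pid)
{
    assert(pid > 0);
    for (size_t i = 0; i < n; i++)
    {
        if (slots[i] == pid)
        {
            slots[i] = 0;
            return (int) i;
        }
    }
    return -1;
}

/*
 * After a restart, copy the live pids from the shared array into a local
 * list. local must have room for n entries. Returns the number copied.
 */
size_t worker_pid_load(const pid_t *slots, size_t n, pid_t *local)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (slots[i] != 0)
            local[count++] = slots[i];
    }
    return count;
}
```

In this sketch the parent would call worker_pid_add() after fork(),
worker_pid_remove() when it reaps a child, and worker_pid_load() once at
startup so that a restarted pcp main can still find the worker to signal.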

The patch contained duplicate for loops in pcp_child.c:reaper().
The attached v2 patch removes the duplicate. It also updates the
copyright year of pcp_child.c.

Regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Attachment Content-Type Size
v2-0001-Fix-pcp-main-process-to-remember-child-pids-upon-.patch application/octet-stream 5.7 KB
