Re: Primary node detection race at clean startup

From: Emond Papegaaij <emond(dot)papegaaij(at)gmail(dot)com>
To: pgpool-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Primary node detection race at clean startup
Date: 2026-05-12 10:49:31
Message-ID: CAGXsc+bBbfW7qQ+2JJ4SY7xZDifrz329cJcxHaoG+kpfGqBJaQ@mail.gmail.com
Lists: pgpool-hackers

Hi,

The patch attached to my previous message was malformed: patch(1)
rejects it, probably because of the large context. Attached is a new
version that patch accepts.

Best regards,
Emond Papegaaij

On Tue, 12 May 2026 at 10:38, Emond Papegaaij <emond(dot)papegaaij(at)gmail(dot)com> wrote:

>
> Hi,
>
> In our tests, we've found an issue that can cause all Pgpool nodes to
> report an incorrect 'Role: standby':
>
>     Role         : standby ← stale, never updated on this node
>     Backend Role : primary ← actual SR-check result
>
> This can happen if all nodes in a watchdog cluster start with a clean
> state at the same time. If the first node is still trying to determine
> the primary database, its primary_node_id is -2. This value is then
> synced to other nodes in the cluster, causing all nodes to report the
> stale state indefinitely. Attached is a patch against 4.7 that should
> fix this.
>
> Note that this analysis was done by Claude Code and it also created
> the patch. The failure on our CI was real though and I think the
> explanation makes sense.
>
> Best regards,
> Emond Papegaaij
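
The race described in the quoted message can be illustrated with a small
sketch. This is not the attached patch and not Pgpool-II source code; it
only assumes, going by the patch filename, that the fix keeps the locally
detected primary when the leader still reports the initial value. The
names PRIMARY_UNDETERMINED and merge_primary_node_id are hypothetical;
only the value -2 comes from the message.

```c
/* Hypothetical name for the value the message says a node reports
 * while it is still trying to determine the primary database. */
#define PRIMARY_UNDETERMINED (-2)

/*
 * Hypothetical helper: choose the primary node id to keep after a
 * state sync from the watchdog leader.
 *
 * If the leader has not finished detection yet, its value carries no
 * information, so the local value is kept. Otherwise the leader's
 * value is adopted, as in a normal sync.
 */
int merge_primary_node_id(int local_id, int leader_id)
{
    if (leader_id == PRIMARY_UNDETERMINED)
        return local_id;    /* do not overwrite with the initial value */
    return leader_id;       /* leader has a real answer: follow it */
}
```

With this guard, a node that has already detected backend 0 as primary
keeps 0 when the leader still reports -2, and switches to the leader's
value once the leader reports a real node id.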

Attachment                                           Content-Type         Size
pgpool-keep-local-primary-when-leader-initial.patch  application/x-patch  2.5 KB
