Re: [PATCH] pg_isready (was: [WIP] pg_ping utility)

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Phil Sorber <phil(at)omniti(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] pg_isready (was: [WIP] pg_ping utility)
Date: 2013-01-27 17:54:13
Message-ID: CAFj8pRCpwx5BZOkcO=7RsDNcyafpRoc7Kaw9Pa1sYdQoCvjedw@mail.gmail.com
Lists: pgsql-hackers

2013/1/27 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Craig Ringer <craig(at)2ndQuadrant(dot)com> writes:
>> That's what it sounds like - confirming that PostgreSQL is really fully
>> shut down.
>
>> I'm not sure how you could do that over a protocol connection, myself.
>> I'd just read the postmaster pid from the pidfile on disk and then `kill
>> -0` it in a delay loop until the `kill` command returns failure. This
>> could be a useful convenience utility but I'm not convinced it should be
>> added to pg_isready because it requires local and possibly privileged
>> execution, unlike pg_isready's network based operation. Privileges could
>> be avoided by using an aliveness test other than `kill -0`, but you
>> absolutely have to be local to verify that the postmaster has fully
>> terminated - and it wouldn't make sense for a non-local process to care
>> about this anyway.
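
The pidfile-plus-`kill -0` loop described above might look roughly like the following. This is only a sketch under assumptions: it assumes a POSIX system, that the first line of the pidfile holds the postmaster PID, and the function names (`read_postmaster_pid`, `process_exists`, `wait_for_exit`) are invented for illustration. Unlike a plain shell `kill -0`, it treats "permission denied" as "process still exists", which is one way around the privilege concern.

```python
import os
import time


def read_postmaster_pid(pidfile_path):
    """Return the PID stored on the first line of a postmaster.pid-style file."""
    with open(pidfile_path) as f:
        return int(f.readline().strip())


def process_exists(pid):
    """Rough equivalent of `kill -0 PID`: signal 0 only checks for existence."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        # No such process.
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def wait_for_exit(pid, timeout=30.0, interval=0.5):
    """Poll until the process is gone; return False if the timeout expires first."""
    deadline = time.monotonic() + timeout
    while process_exists(pid):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

A caller would do something like `wait_for_exit(read_postmaster_pid(path))`, where `path` points at the pidfile in the data directory. Note that this only reports on the postmaster process itself, not on any backends it may have left behind.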
>
> This problem is actually quite a bit more difficult than it looks.
> In particular, the mere fact that the postmaster process is gone does
> not prove that the cluster is idle: it's possible that the postmaster
> crashed leaving orphan backends behind, and the orphans are still busily
> modifying on-disk state. A real postmaster knows how to check for that
> (by looking at the nattch count of the shmem segment cited in the old
> lockfile) but I can't see any shell script getting it right.
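
To make the nattch idea concrete, here is a minimal Linux-only sketch that looks up the attach count of a System V shared memory segment. It is not what the postmaster does internally (the C route would be `shmctl()` with `IPC_STAT`, reading `shm_nattch`); instead it parses `/proc/sysvipc/shm`, and it assumes the caller already knows the segment ID from the old lockfile. The function names are hypothetical, and the sketch does nothing about the races that make the real check hard.

```python
def parse_nattch(proc_shm_text, shmid):
    """Return the nattch value for shmid in /proc/sysvipc/shm text, or None."""
    lines = proc_shm_text.splitlines()
    if not lines:
        return None
    # Locate columns by header name rather than by fixed position.
    header = lines[0].split()
    id_col = header.index("shmid")
    nattch_col = header.index("nattch")
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= max(id_col, nattch_col):
            continue  # Skip short or malformed rows.
        if int(fields[id_col]) == shmid:
            return int(fields[nattch_col])
    return None


def segment_in_use(shmid, path="/proc/sysvipc/shm"):
    """True if the segment exists and at least one process is attached to it."""
    with open(path) as f:
        nattch = parse_nattch(f.read(), shmid)
    return nattch is not None and nattch > 0
```

A nonzero count for the old segment would suggest that orphaned backends are still attached even though the postmaster PID is gone; a missing segment or a count of zero is consistent with a fully stopped cluster.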
>
> So ATM I wouldn't trust any method short of "try to start a new
> postmaster and see if it works", which of course is not terribly helpful
> if your objective is to get to a stopped state.
>
> We could consider transposing the shmem logic into a new pg_ctl command.
> It might be better though to have a new switch in the postgres
> executable that just runs postmaster startup as far as detecting
> lockfile conflicts, and reports what it found (without ever launching
> any child processes that could confuse matters). Then "pg_ctl isdone"
> could be a frontend for that, instead of duplicating logic.
>

+1

Pavel

> regards, tom lane
