Re: Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Ashesh Vashi <ashesh(dot)vashi(at)enterprisedb(dot)com>, Mark Llewellyn <mark_llewellyn(at)adp(dot)com>, pgsql-hackers(at)postgresql(dot)org, Sujeet Rajguru <sujeet(dot)rajguru(at)enterprisedb(dot)com>
Subject: Re: Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Date: 2010-11-24 05:14:45
Message-ID: 201011240514.oAO5Ejd22656@momjian.us
Lists: pgsql-bugs pgsql-hackers

Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > > Tom Lane wrote:
> > >> Possibly the cleanest fix is to implement pg_ping as a libpq function.
> > >> You do have to distinguish connection failures (ie connection refused)
> > >> from errors that came back from the postmaster, and the easiest place to
> > >> be doing that is inside libpq.
> >
> > > OK, so a new libpq function --- got it. Would we just pass the status
> > > from the backend or can it be done without backend modifications?
> >
> > It would definitely be better to do it without backend mods, so that
> > the functionality would work against back-branch postmasters.
> >
> > To my mind, the entire purpose of such a function is to classify the
> > possible errors so that the caller doesn't have to. So I wouldn't
> > consider that it ought to "pass back the status from the backend".
> > I think what we basically want is a function that takes a conninfo
> > string (or one of the variants of that) and returns an enum defined
> > more or less like this:
> >
> > * failed to connect to postmaster
> > * connected, but postmaster is not accepting sessions
> > * postmaster is up and accepting sessions
> >
> > I'm not sure those are exactly the categories we want, but something
> > close to that. In particular, I don't know if there's any value in
> > subdividing the "not accepting sessions" status --- pg_ctl doesn't
> > really care, but other use-cases might want to tell the difference
> > between the various canAcceptConnections failure states.
> >
> > BTW, it is annoying that we can't definitively distinguish "postmaster
> > is not running" from a connectivity problem, but I can't see a way
> > around that.
>
> Agreed. I will research this.

I have researched this and developed the attached patch. It implements
PQping() and PQpingParams() in libpq, and has pg_ctl use them for pg_ctl
-w server status detection.
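
As a rough illustration only (this is not code from the attached patch),
a pg_ctl -w style wait loop built on a ping-like call might look like the
sketch below. The enum, the probe callback type, wait_for_server() and
fake_probe() are all hypothetical names invented for this example; the
callback stands in for whatever libpq call reports server status.

```c
#include <stddef.h>

/* Hypothetical status codes; not the names used in the attached patch. */
typedef enum
{
    WAIT_SKETCH_NO_RESPONSE,    /* nothing answered at the server address */
    WAIT_SKETCH_REJECTED,       /* server answered but refused the session */
    WAIT_SKETCH_ACCEPTING       /* server answered and accepted the session */
} WaitSketchStatus;

/* Hypothetical probe callback standing in for a libpq ping call. */
typedef WaitSketchStatus (*probe_fn) (void *arg);

/*
 * Probe until the server answers or max_tries probes have been made.
 * Returns the last status seen.  A real loop would sleep between tries;
 * here that is delegated to pause_fn (may be NULL) to keep the sketch
 * portable.
 */
static WaitSketchStatus
wait_for_server(probe_fn probe, void *arg, int max_tries,
                void (*pause_fn) (void))
{
    WaitSketchStatus status = WAIT_SKETCH_NO_RESPONSE;
    int         i;

    for (i = 0; i < max_tries; i++)
    {
        status = probe(arg);
        if (status != WAIT_SKETCH_NO_RESPONSE)
            break;              /* server is up, whether or not we got in */
        if (pause_fn != NULL)
            pause_fn();
    }
    return status;
}

/*
 * Stand-in probe for experimentation: arg points to an int holding the
 * number of probes that should fail before the server "comes up".
 */
static WaitSketchStatus
fake_probe(void *arg)
{
    int        *remaining = (int *) arg;

    if (*remaining > 0)
    {
        (*remaining)--;
        return WAIT_SKETCH_NO_RESPONSE;
    }
    return WAIT_SKETCH_ACCEPTING;
}
```

The point of the loop is that any answer from the server ends the wait,
so a rejected login is no longer mistaken for a server that never started.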

The new output when .pgpass does not permit a connection is:

    $ pg_ctl -w -l /dev/null start
    waiting for server to start.... done
    server started
    However, could not connect, perhaps due to invalid authentication or
    misconfiguration.

The code checks the connection status between PQconnectStart() and
connectDBComplete() to determine whether the server is running even
though our connection attempt failed for some reason.
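
To make the distinction concrete, here is a minimal sketch of the
three-way classification discussed earlier in the thread. It is not taken
from the patch: the enum names, the two boolean inputs, and the message
strings are assumptions chosen for illustration, and real code would
derive the inputs from libpq's connection state.

```c
#include <stdbool.h>

/*
 * Hypothetical result codes modeled on the three categories proposed
 * earlier in this thread; these are not the names used in the patch.
 */
typedef enum
{
    PING_SKETCH_NO_RESPONSE,    /* failed to connect to postmaster */
    PING_SKETCH_REJECTED,       /* connected, but no session was granted */
    PING_SKETCH_OK              /* postmaster is up and accepting sessions */
} PingSketchStatus;

/*
 * Classify one connection attempt.
 *   reached_server:      assumed flag, true once the server answered at all
 *   session_established: assumed flag, true if the connection completed
 */
static PingSketchStatus
classify_attempt(bool reached_server, bool session_established)
{
    if (session_established)
        return PING_SKETCH_OK;
    if (reached_server)
        return PING_SKETCH_REJECTED;
    return PING_SKETCH_NO_RESPONSE;
}

/* Map a status to an illustrative (made-up) user-facing message. */
static const char *
status_message(PingSketchStatus status)
{
    switch (status)
    {
        case PING_SKETCH_OK:
            return "server started";
        case PING_SKETCH_REJECTED:
            return "server started, but could not connect";
        case PING_SKETCH_NO_RESPONSE:
            return "no response from server";
    }
    return "unknown status";
}
```

Keeping the classification in one function means callers such as pg_ctl
only switch on the result and never inspect error strings themselves.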

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachment Content-Type Size
/pgpatches/pg_ctl_v2 text/x-diff 9.2 KB
