Reliably determining whether the server came up

From: Mischa Sandberg <mischa_sandberg(at)telus(dot)net>
To: pgsql-admin(at)postgresql(dot)org
Subject: Reliably determining whether the server came up
Date: 2008-11-12 22:05:35
Message-ID: 1226527535.491b532f4c45e@legacywebmail.telus.net
Lists: pgsql-admin

I've been trying to work out a reliable script to determine,
after pg_ctl start, that the server is done attempting
to come up, and that it has either succeeded OR FAILED.
This is for several hundred unattended appliance-type servers,
currently on PG 8.0 but soon to be on 8.3.

Haven't found anything in the archives.
I want to determine success/failure without time-outs, because
the db is restarted every time a server gets an upgrade,
a server can get several upgrades in a batch, and the cpu/disk load
during an upgrade is highly variable; even a restart with no
recovery may require as much as a minute to get to 'ready'.

We also need to restart the server several hundred times
in our in-house system tests.

So pg_ctl -w start is not an option, even if the timeout were
configurable to under a minute.

The best I have that doesn't involve modifying pg_ctl is:

# Hand-compute $NEXT_LOG from postgresql.conf
# parameters (log_directory) and (log_filename).
# Replace %S format with a '??' wildcard (yech).

$ TEMP_LOG=/tmp/pg.$PGPORT.log
$ touch $NEXT_LOG >$TEMP_LOG
$ FROM=`awk 'END {print NR+1}' $NEXT_LOG`
$ pg_ctl start -s -l $TEMP_LOG
$ while ! tail -n +$FROM $NEXT_LOG |
    egrep -hw 'FATAL|PANIC|DETAIL|ready|shutting|^postmaster cannot' $TEMP_LOG -; do
    sleep 1; done

The nasty cases are when the server fails (exits)
without being able to create its std log file (e.g.
error in postgresql.conf).

So I'm down to patching start_postmaster in pg_ctl.c
to use popen("... & echo $!") instead of system("... &"),
then making test_postmaster_connection do a kill(pid, 0)
if PQsetdbLogin fails, to check whether the postmaster is still alive.
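
For comparison, the same "connect, else check the pid" idea can be sketched in plain shell without touching pg_ctl.c. This is only a rough illustration, not the pg_ctl patch itself: the function name wait_ready_or_dead is made up, and the probe is whatever command the caller passes in. It assumes the postmaster is launched directly in the background so that $! is its pid, and that the probe fails for as long as the server is not ready.

```shell
# Sketch (hypothetical helper): wait until a background server either
# answers a probe command or has exited. No time-out is involved.
#   $1      pid of the server process
#   $2...   probe command; must succeed once the server is ready
# Returns 0 if the probe succeeded, 1 if the process went away first.
wait_ready_or_dead() {
    pid=$1; shift
    while :; do
        if "$@" >/dev/null 2>&1; then
            return 0        # probe succeeded: server is ready
        fi
        if ! kill -0 "$pid" 2>/dev/null; then
            return 1        # no such process any more: startup failed
        fi
        sleep 1
    done
}

# Possible usage (assumes postmaster and psql are on PATH, and that
# PGDATA, PGPORT and TEMP_LOG are set as in the script above):
#   postmaster -D "$PGDATA" >>"$TEMP_LOG" 2>&1 &
#   wait_ready_or_dead $! psql -At -c 'SELECT 1' template1
```

One caveat with this sketch: a psql probe can also fail for reasons unrelated to startup (e.g. authentication), in which case the loop keeps waiting for as long as the postmaster stays alive, so the probe needs to be something that is known to succeed on a healthy server.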

Any suggestions appreciated.
--
Engineers think that equations approximate reality.
Physicists think that reality approximates the equations.
Mathematicians never make the connection.
