Re: [HACKERS] WAL archiving idle database

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Jeff Davis" <pgsql(at)j-davis(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Brian Wipf" <brian(at)clickspace(dot)com>, <pgsql-general(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: [HACKERS] WAL archiving idle database
Date: 2007-10-29 14:56:37
Message-ID: 4725AE55.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

>>> On Fri, Oct 26, 2007 at 6:39 PM, in message
<1193441976(dot)7624(dot)128(dot)camel(at)dogma(dot)ljc(dot)laika(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>
wrote:
> On Fri, 2007-10-26 at 18:06 -0500, Kevin Grittner wrote:
>> Hmmm... We would actually prefer to get the WAL file at the
>> specified interval. We have software to ensure that the warm
>> standby instances are not getting stale, and that's pretty simple
>> with the current behavior.
>
> Another thought: when you say it's "pretty simple", what do you do now?
> My monitoring scripts for this particular situation employ some pretty
> ugly code.

Here's our script:

#! /bin/bash

if [ "$1" == "" ] ; then
savepwd=$PWD
cd /var/pgsql/data/county/
find * -maxdepth 0 -type d | xargs -idirname $0 dirname
cd $savepwd
exit 0
fi

for countyName ; do
echo County: $countyName
/usr/local/pgsql/bin/pg_controldata /var/pgsql/data/county/$countyName/data | grep -E '(Database cluster state|pg_control last modified)'
/etc/init.d/postgresql-${countyName}-cc status
grep basebackup /var/pgsql/data/county/$countyName/data/basebackup-of-this-instance
echo ''
done

Here's an example of running it (although the opcenter usually runs
it without a parameter, to get all counties):

opcenter(at)PGBACKUP:~> sudo pgstatus.sh iowa
County: iowa
Database cluster state: in archive recovery
pg_control last modified: Mon 29 Oct 2007 09:03:16 AM CDT
pg_ctl: server is running (PID: 15902)
/usr/local/pgsql-8.2.4/bin/postgres -D /var/pgsql/data/county/iowa/data
basebackup cc-2007-10-26_190001

This gets parsed by a script in our monitor (python, I think) and
winds up feeding a status display. It's probably a bit crude, but
it has worked well for us, with very little time required to get it
going. This thread has made me aware that it is dependent on the
checkpoint frequency as well as the archive frequency. Our
checkpoint_timeout setting is 30min and our archive_timeout is the
default (one hour). The monitor shows red if the cluster state
isn't "in archvie recovery" or pg_ctl doesn't report "server is
running" or the last modified is older than 75 minutes.

We are OK with a one hour archive interval because we have a
separate application-oriented transaction stream (independent of
the database product) coming back real-time, which we can replay
to "top off" a database backup.

-Kevin

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kevin Grittner 2007-10-29 15:54:53 Re: [HACKERS] WAL archiving idle database
Previous Message Pit M. 2007-10-29 14:44:05 Problems with PostgreSQL DBI-Link / DBD-ODBC

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2007-10-29 15:54:53 Re: [HACKERS] WAL archiving idle database
Previous Message Andrew Dunstan 2007-10-29 14:37:29 Re: [HACKERS] grep command