Creating an 8.2 warm-standby demo system

From: "Charles Duffy" <charles(dot)duffy(at)gmail(dot)com>
To: sydpug(at)postgresql(dot)org
Subject: Creating an 8.2 warm-standby demo system
Date: 2006-10-10 09:32:40
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: sydpug


Here are the notes on putting together an 8.2 based warm standby
demonstration system. I hope people find them useful.

As I mentioned in the presentation, the basic idea behind warm standby
is to employ a shell script which periodically polls the WAL archive
directory, executing the archive recovery procedure when it detects
that a new WAL segment has arrived. The standby server thus processes
archived segments as and when they are sent from the master.

The shell script is specified in recovery.conf in the
'restore_command' parameter in recovery.conf.

The problem of yet-to-be-archived transactions being lost in the event
of a failure on the master is addressed by the addition of the
'archive_timeout' parameter in postgresql.conf, which will force a WAL
segment to be archived every <archive_timeout> seconds.

The procedure outlined below is just intended to produce a
demonstration system - a production HA setup would be very different.

1) Obtain the 8.2 beta source distribution from

2) Create a directory tree into which you can install the 'master' and
'slave' postgres servers.

mkdir -p ~/pg82demo/master
mkdir -p ~/pg82demo/slave
mkdir -p ~/pg82demo/wal_archive

3) Extract the postgres source archive. From the top level, execute:

./configure --prefix=/home/test/pg82demo/master
make install

And then install the slave:

./configure --prefix=/home/test/pg82demo/slave
make install

(Replace 'test' in the above with the appropriate string for your
configuration. Throughout the rest of the notes, I'll specify the user

4) Go to the 'master' postgres installation, and initialise a database cluster:

cd ~/pg82demo/master
bin/initdb data/

Make the following entries in data/postgresql.conf:

archive_command = 'rsync -arv %p /home/test/pg82demo/wal_archive/%f </dev/null'
archive_timeout = 60

The reason for using 'rsync' instead of 'cp' will be explained later.

5) Start the master with

bin/pg_ctl -D data/ start -l <logfile>

6) Connect to the master with psql, and initiate a base backup with:

select pg_start_backup('test1');

Put psql in the background with ctrl-z, and take a backup of the data directory:

tar cvzf base.tgz data

Bring back psql with 'fg', and finalise the base backup:

select pg_stop_backup();

Check that postgres has written the history file to the specified
archive directory. These operations have to be performed as the
database superuser.

7) Go to the 'slave' installation. Copy the base backup archive over
from the master, and extract it:

cd ../slave
cp ../master/base.tgz .
tar xvzf base.tgz
rm -f base.tgz

8) Modify the slave's postgresql.conf as follows:

port = 5433

The 'port' entry is modified so that the slave won't try to bind to
the same port that the master has already bound to.

Also, comment out the 'archive_command' and 'archive_timeout' entries
in the slave's 'postgresql.conf' file.

9) Delete the '' file from the slave's data directory:

rm -f data/

10) Copy the recovery.conf.sample file from the 'share/postgresql/'
direcory into the slave's data directory as 'recovery.conf'

11) Save the attached shell script '' to ~/pg82demo/, and
make sure it is set executable.

12) Edit the shell script, and make sure the directory paths for
TRIGGER_FILE are correct.

Don't worry about what the 'trigger file' is yet exactly. The
TRIGGER_FILE assignment statement in the shell script should look


With <username> replaced with your configuration's path, of course.

13) Add the following to the data/recovery.conf on the slave:

restore_command = '/home/test/pg82demo/
/home/test/pg82demo/wal_archive/%f "%p"'

14) Start the slave server with:

bin/pg_ctl -D data/ -l <logfile>

15) Connect to the master, and trigger some WAL archive activity by
creating a large table:

create table foo as (select a,b from pg_class as a, pg_class as b);

The master's log should show output like:

LOG: archived transaction log file "000000010000000000000000"
LOG: archived transaction log file "000000010000000000000000.00399A30.backup"

And the slave's log should contain entries like:

LOG: restored log file "000000010000000000000000.00399A30.backup" from archive
`/home/test/pg82test/wal_archive/000000010000000000000000' ->
LOG: restored log file "000000010000000000000000" from archive
LOG: checkpoint record is at 0/399A30
LOG: redo record is at 0/399A30; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 0/595; next OID: 10818
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: automatic recovery in progress
LOG: redo starts at 0/399A78

As a new segment is archived by the master, it will log a message like:

LOG: archived transaction log file "000000010000000000000001"

Immediately the archive is generated, there should be a corresponding
message on the slave which looks like:

`/home/test/pg82demo/wal_archive/000000010000000000000001' ->
LOG: restored log file "000000010000000000000001" from archive

16) To initiate a failover from the master to the slave, create the
'trigger file':

touch ~/pg82demo/trigger

This should immediately cause the slave to finish processing archived
segments, exit recovery mode, and come up ready for use.

To repeat the test, it is not necessary to repeat the entire procedure above.
Stop the slave, and delete its data directory. Then, continue the
procedure from step (7) - that is, start the recovery procedure on the
slave, beginning with extracting the base backup.

The slave will process all the WAL segments in the archive directory,
and stop after the last one. Until the failover is triggered, it will
then wait for more segments to process, as described above.

The reason rsync is used in the archive_command is that rsync features
an 'atomic copy' - that is, the in-progress destination file is
created as a temp file, and then renamed when the copy is complete. In
the situation above, where segments are archived straight to the
directory that the slave reads from, 'cp' can cause an error whereby
the slave attempts to process a partially-copied WAL segment. If this
happens, postgres will emit an error like:

PANIC: archive file "000000010000000000000031" has wrong size:
1810432 instead of 16777216
LOG: startup process (PID 11356) was terminated by signal 6
LOG: aborting startup due to startup process failure

Another wrinkle in the whole setup is the fact that when in recovery
mode, postgres will ask for files to be recovered that do not, and
will not, exist. This would seem to conflict with the idea of using
the parameters to the restore_command shell script to poll the archive
directory, since the script will hang indefinitely if it cannot find
the file it is asked for.

First off, it seems postgres will always ask for the file
'00000001.history'. As a workaround, the supplied shell script looks
for the string 'history' in the filename parameter. If detected, the
script immediately tries to copy the file, returning cp's error code
to postgres.

I had thought this might be a bug, but see the post:

The documentation doesn't quite make it clear whether or not the shell
script is supposed to determine the expected WAL archive filename by
looking at the sequence of previously archived files, or just by the
passed parameter. This is an issue that requires a bit more
investigation and experimentation.


Charles Duffy

Attachment Content-Type Size application/x-sh 432 bytes

Browse sydpug by date

  From Date Subject
Next Message jarich 2006-10-17 08:33:52 OSDC 2006's fabulous keynotes
Previous Message Gavin Sherry 2006-10-03 03:51:23 Reminder: SydPUG meeting tonight