hot spare / log shipping work on

From: Gaetano Mendola <mendola(at)bigfoot(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: hot spare / log shipping work on
Date: 2004-08-13 14:52:47
Message-ID: 411CD5BF.6010509@bigfoot.com
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,
I have some spare time and I'm testing what Tom Lane
suggested:

===========================================================================
Tom Lane wrote:
1. You set up WAL archiving on the master, and arrange to ship copies of
completed segment files to the slave.

2. You take an on-line backup (ie, tar dump) on the master, and restore
it on the slave.

3. You set up a recovery.conf file with the restore_command being some
kind of shell script that knows where to look for the shipped-over
segment files, and also has a provision for being signaled to stop
tracking the shipped-over segments and come alive.

4. You start the postmaster on the slave. It will try to recover. Each
time it asks the restore_command script for another segment file, the
script will sleep until that segment file is available, then return it.

5. When the master dies, you signal the restore_command script that it's
time to come alive. It now returns "no such file" to the patiently
waiting postmaster, and within seconds you have a live database on the
slave.
===========================================================================

Here is how I'm expanding on the points above:

1) This is the easy part; the task can be accomplished with a simple archive_command:

cp %p /mnt/server/archivedir/%f
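
A plain cp makes the segment visible under its final name while it is still being written, which is why the restore side has to check the file size. A minimal sketch of an alternative, assuming the archive directory is a single filesystem where rename is atomic: copy to a temporary name and rename afterwards. The script name archive_wal.sh and the function archive_segment are hypothetical, not part of PostgreSQL.

```shell
#!/bin/sh
# Hypothetical helper: archive_wal.sh <segment_path> <archive_dir>
# Copies a completed WAL segment under a temporary name, then renames it,
# so a reader never sees a half-copied file under the final name.

archive_segment() {
    seg_path=$1
    archive_dir=$2
    seg_name=$(basename "$seg_path")

    # Refuse to overwrite a segment that was already archived.
    [ ! -e "$archive_dir/$seg_name" ] || return 1

    if ! cp "$seg_path" "$archive_dir/$seg_name.tmp"; then
        rm -f "$archive_dir/$seg_name.tmp"
        return 1
    fi
    mv "$archive_dir/$seg_name.tmp" "$archive_dir/$seg_name"
}

# Run only when arguments are given, so the function can also be sourced.
if [ "$#" -gt 0 ]; then
    archive_segment "$@"
    exit $?
fi
```

Under these assumptions the archive_command would become something like: /path/to/archive_wal.sh %p /mnt/server/archivedir (the /path/to location is a placeholder). The size check on the slave then acts only as a safety net.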

2) Easy task

3+4) I have already written a restore_command script that does the trick. It takes 3
     parameters: <source> <target> <partial_directory>

     The partial_directory will contain the partial segment, shipped each
     minute, and a file "alive" that is "touch"ed periodically.

     When called, the script performs these tasks:
a) Check whether the requested file exists.

a1) If it exists, check that it is a 16MB file (the request can
    arrive during the copy). If it is not 16MB, sleep for
    1 second and retry. This is done for up to 20 tries; after
    that the script times out and exits with a nonzero return code.
    Once the file reaches 16MB (or if it already is a 16MB
    file), it is copied with: cp <source> <target>

a2) If the file does not exist, this means the segment has not yet been
    completed and archived, and is only a partial file in the partial
    directory. Check whether the "alive" file is older than 2 minutes.
    a21) If the file is older than 2 minutes, I assume that
         the master is dead: I move the partial WAL file
         present in the partial directory to the <target>
         path and exit returning 0 (the requested file
         was the partial one). If the partial file does not exist,
         this means that a previous call already moved it,
         so I have to exit with a nonzero value.

    a22) If the file is newer than 2 minutes, I assume that
         the master is alive: I sleep for 5 seconds and
         restart from point a).
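
The steps a) to a22) can be sketched roughly as follows. This is not the script mentioned above, only an illustration under stated assumptions: the partial file is named <segment>.partial, a missing "alive" file is treated as "master still alive" (to avoid a spurious failover), and file ages are read with stat -c %Y (GNU) or stat -f %m (BSD). All function and variable names are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: restore_wal.sh <source> <target> <partial_directory>

SEGMENT_SIZE=${SEGMENT_SIZE:-16777216}  # size of a complete segment (16MB)
COPY_RETRIES=${COPY_RETRIES:-20}        # 1-second retries while a copy is in flight
ALIVE_MAX_AGE=${ALIVE_MAX_AGE:-120}     # seconds before the master is presumed dead
POLL_INTERVAL=${POLL_INTERVAL:-5}       # seconds between checks while waiting

# Print a file's modification time in seconds since the epoch.
mtime_epoch() {
    stat -c %Y "$1" 2>/dev/null || stat -f %m "$1"
}

# Succeed only if "alive" exists and is older than ALIVE_MAX_AGE.
# Assumption: a missing file means "keep waiting", not "master dead".
master_is_dead() {
    [ -f "$1" ] || return 1
    age=$(( $(date +%s) - $(mtime_epoch "$1") ))
    [ "$age" -gt "$ALIVE_MAX_AGE" ]
}

# a1) Wait for the archived segment to reach full size, then copy it.
copy_complete_segment() {
    try=0
    while [ "$try" -lt "$COPY_RETRIES" ]; do
        size=$(( $(wc -c < "$1") ))
        if [ "$size" -eq "$SEGMENT_SIZE" ]; then
            cp "$1" "$2"
            return $?
        fi
        try=$((try + 1))
        sleep 1
    done
    return 1    # timed out: the file never became a full segment
}

restore_segment() {
    if [ "$#" -ne 3 ]; then
        echo "usage: restore_wal.sh <source> <target> <partial_directory>" >&2
        return 2
    fi
    source_file=$1
    target_file=$2
    partial_dir=$3
    partial_file="$partial_dir/$(basename "$source_file").partial"

    while :; do
        # a) / a1) the segment has been archived
        if [ -f "$source_file" ]; then
            copy_complete_segment "$source_file" "$target_file"
            return $?
        fi
        # a2) / a21) master presumed dead: hand over the partial file once
        if master_is_dead "$partial_dir/alive"; then
            [ -f "$partial_file" ] || return 1
            mv "$partial_file" "$target_file"
            return $?
        fi
        # a22) master alive: wait and start again from a)
        sleep "$POLL_INTERVAL"
    done
}

# Run only when arguments are given, so the functions can also be sourced.
if [ "$#" -gt 0 ]; then
    restore_segment "$@"
    exit $?
fi
```

With a script like this, the restore_command in recovery.conf might look like: /path/to/restore_wal.sh /mnt/server/archivedir/%f %p /mnt/server/partial (both the /path/to location and the partial directory path are placeholders). Matching the partial file by segment name, rather than taking whatever file is in the directory, is a design choice of this sketch: a request for any other file fails cleanly once the master is presumed dead.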

5) If the master dies, the daemon (a shell script running on
   the master) will no longer touch the "alive" file.
   While the master is alive, the daemon copies the current WAL file into the
   <partial directory> under the name <current_name>.tmp and, after the copy:
   mv <current_name>.tmp <current_name>.partial
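
A rough sketch of what such a daemon could look like, under several assumptions: the current WAL file is guessed as the most recently modified file in the xlog directory whose name is 24 hexadecimal digits, the partial directory is reachable as a local path, and "alive" is touched only after a successful pass. The names ship_partial.sh, current_segment and ship_once are hypothetical.

```shell
#!/bin/sh
# Hypothetical master-side daemon: ship_partial.sh <xlog_dir> <partial_dir>

SHIP_INTERVAL=${SHIP_INTERVAL:-60}   # seconds between shipments

# Guess the segment currently being written: the newest file in the
# directory whose name looks like a WAL segment (assumption).
current_segment() {
    ls -t "$1" | grep -E '^[0-9A-F]{24}$' | head -n 1
}

# Ship the current segment once, then refresh the "alive" marker.
ship_once() {
    xlog_dir=$1
    partial_dir=$2
    seg=$(current_segment "$xlog_dir")
    if [ -n "$seg" ]; then
        # Copy under a temporary name, then rename, so the slave never
        # picks up a half-copied partial file.
        cp "$xlog_dir/$seg" "$partial_dir/$seg.tmp" || return 1
        mv "$partial_dir/$seg.tmp" "$partial_dir/$seg.partial" || return 1
    fi
    touch "$partial_dir/alive"
}

# Run only when arguments are given, so the functions can also be sourced.
if [ "$#" -gt 0 ]; then
    while :; do
        ship_once "$@"
        sleep "$SHIP_INTERVAL"
    done
fi
```

This sketch does not remove .partial files left over from earlier segments, and it does not check whether the postmaster itself is still running; both would need to be handled separately.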

Do you see any pitfalls in this approach?
I expect to test it in about an hour, and I will let you know.

Regards,
Gaetano Mendola

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBHNW+7UpzwH2SGd4RAsMBAJ9diSsgG3y6rnueWbZLOvjzko07OwCdGaxE
f8mwC9A4sDJ8nN+XhcUKjP8=
=9SrG
-----END PGP SIGNATURE-----
