Howto: Using PITR recovery for standby replication

From: "Benjamin Krajmalnik" <kraj(at)illumen(dot)com>
To: <pgsql-admin(at)postgresql(dot)org>
Subject: Howto: Using PITR recovery for standby replication
Date: 2006-04-21 02:58:51
Message-ID: 8511B4970E0D124898E973DF496F9B4357E0F4@stash.stackdump.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

I am running PostgreSQL 8.1.3 on Windows.

The project itself is a real time data acquisition system, so it cannot be taken off-line for backups.

I have tried using pg_dump, but discovered that the backup was not a consistent backup. The application currently inserts about 1 million rows per day (will ramp to about 5 million when in full production). All of the insertion of the data is controlledby a master stored procedure which inserts rows into a raw log, and dynamically aggregates data into anclliary tables which enable us to see statistical data of the systems being monitored withut having to mine the raw data.

The original prototype of this system was running onder MS SQL Server 2000, but once PostgreSQL 8.1 was released I decided to port it. The biggest challenge which I have right now is to ensure that we can have data recovery in case of a catastrophic failure in the primary system - with the ability to load a "cold spare".

Back to the problem I faced when testing backups with pg_dump, it appears that the backup was not a consistent backup of the data. For example, sequences which are used by some tables bo longer held the correct values (the tables now held higher values), and ths would indicate to me that the backup of an aggregate table may not match the underlying raw data which created it.

As such, my only option is to create a "hot backup" using PITR. I would like to know if the following scenario would work:

A secondary server with the same version of PostgreSQL is loaded on a secondary server. The PostgreSQL service on the second box would not be running. I would issue a pg_start_backup. I would then copy the the database directory to the second box. Issue a pg_stop_backup. I would delete the WAL logs form the secondary box's pg_xlog. I wuld then copy the archived WAL's as well as the current WAL to the secondary pg_xlog location.

In could then backup the snapshot from the secondatry box to a lesser media for archival purposes, and in the event of a problem, I would simply start the service on he secondary box.

Is this a workable solution? Or, better yet, could th secondary be ive and, after the initial backup and restore from the main box, could replication be acomplished by somehow moving the new archived logs to the secondary box, thereby creating a timed replication (forexample, every hour we cold create anothe backup ad just move the WAL file oer, since the state of the secondary database shoud reflect the state of the previous bacukp)?

While I absolutely love PotgreSQL, and together with some of the add-ons (pgAdmin, pgAgent, the add-ons from EMS) there is alost nothing missing, the relative difficulty of backing up / restoring vis-a-vis the commercial solutions is frustrating. Not hat it is a PostgreQL problem, but rather a learning curve, but until I get this working atisfactorily I am a bit worried.

As always, any insight and assistance ill be deeply appreciated.

Regards,

Benjamin

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Tom Lane 2006-04-21 03:09:58 Re: Howto: Using PITR recovery for standby replication
Previous Message Jim C. Nasby 2006-04-21 02:57:44 Re: Invalid Page Headers