Re: Backing up large databases

From: Robin Iddon <robin(at)edesix(dot)com>
To: Steve Burrows <steve(at)jla(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Backing up large databases
Date: 2006-04-28 18:18:11
Message-ID: 44525C63.1090005@edesix.com
Lists: pgsql-admin

Hi Steve,

If you can afford to move forward to 8.x then you can benefit from PITR 
(Point-In-Time Recovery).

See http://www.postgresql.org/docs/8.0/interactive/backup-online.html
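
The archiving half of that is just one setting in postgresql.conf; 
something along these lines should do, where the destination directory is 
only an example:

  # sketch only -- /mnt/backup/wal_archive is a placeholder path
  archive_command = 'cp -i %p /mnt/backup/wal_archive/%f </dev/null'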

You can do this without making a live PITR replica (as I understand it, 
you're happy to trust your hardware, so you don't need a hot standby 
machine ready to take over when you have a hardware failure).

This mechanism allows the base snapshot(s) to be taken while the 
database is online.  In fact, apart from recovery itself, there is never 
a need to take the database offline.
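
To make that concrete, an online base backup boils down to something like 
this (the label and paths are just examples; tar will probably warn about 
files changing underneath it while the database is busy, which is expected):

  psql -c "SELECT pg_start_backup('nightly');"
  tar cf /mnt/backup/base/base-$(date +%Y%m%d).tar /var/lib/pgsql/data
  psql -c "SELECT pg_stop_backup();"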

So you would then need to (a) copy your periodic base backup to 
tape/remote disk/..., and (b) copy each archived WAL file to tape/remote 
disk/... as it is produced.

Overall you will be writing much more data this way than by running 
pg_dump now and then, but provided you can afford the storage you can 
recover to any point in time covered by your archived WAL files.
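
Recovery is then a matter of dropping a recovery.conf into the data 
directory before restarting the postmaster; a minimal sketch, assuming the 
archive location above and a made-up target time:

  restore_command = 'cp /mnt/backup/wal_archive/%f %p'
  recovery_target_time = '2006-04-28 03:00:00'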

Sooner or later you are going to hit the magic 250GB point, where your 
0.5TB array isn't big enough to hold both the live data and a base 
backup.  At that point you will need to look at making the base backup 
happen across the network (or add more disks).  I don't think it matters 
how long the base backup takes, provided you are always copying the WAL 
files too.
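
A networked base backup needn't be anything fancier than tar piped over 
ssh to another box, along these lines (the hostname and paths are made up):

  psql -c "SELECT pg_start_backup('weekly');"
  tar cf - /var/lib/pgsql/data | ssh backuphost "cat > /backups/base-$(date +%Y%m%d).tar"
  psql -c "SELECT pg_stop_backup();"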

Hope this helps,

Robin

Steve Burrows wrote:

> I am struggling to find an acceptable way of backing up a PostgreSQL 
> 7.4 database.
>
> The database is quite large, currently about 180GB, divided into two 
> elements: a set of active tables and a set of archive tables which are 
> only used for insertions.
>
> I ran pg_dump -Fc recently; it took 23.5 hours to run and output a 
> single file of 126GB. Obviously, as the database continues to grow it 
> will soon be so large that it cannot be pg_dumped within a day. 
> Running rsync to do a complete fresh copy of the pgsql file structure 
> took 4 hours, but later that day another iteration of rsync (which 
> should have copied only changed files) took 3 hours, and I cannot 
> afford to have the db down that long.
>
> Anybody with any ideas? The database is being used as the backend for 
> a mail server, so it has transactions 24 hours a day but is quieter at 
> night. I want to be able to back it up or replicate it on a daily 
> basis with minimum downtime so that the mail backlog doesn't get too 
> large. Ideally I want the first generation of backup/replica to go 
> onto the same machine as the original, because the volume of data is 
> such that any attempt at network or tape backup of the live files will 
> require too much downtime; once I've got a backup I can copy it out to 
> other NAS or tape at leisure.
>
> If anyone has experience of safeguarding a similarly large PostgreSQL 
> database with minimal downtime I'd be delighted to hear from you.  The 
> machine is running 2 Xeons, 4GB RAM and a half-terabyte RAID10 array on 
> a Dell PERC SCSI subsystem, with a load average of around 0.5 - 0.6, so 
> it's not exactly overstretched.
>
> Thanks,
>
> Steve


