Re: Backing up large databases

From: <alex(dot)cotarlan(at)thomson(dot)com>
To: <steve(at)jla(dot)com>, <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Backing up large databases
Date: 2006-04-28 22:46:30
Message-ID: AAE576289623D74CB1DA62F2621A94530A186BAA@TFUSNYNYCMBX01.ERF.THOMSON.COM
Lists: pgsql-admin

You could archive the data as you perform the dump:

pg_dump <your options> | bzip2 > filename.bz2

bzip2 offers excellent compression, so you could end up with an
archive of less than 10GB.

Later you can reload that into another database on a different server.
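
For the reload, a rough sketch, assuming the dump was taken in plain
SQL format and the target database already exists (the host and
database names here are illustrative):

bzcat filename.bz2 | psql -h newserver newdb

If the dump was made with -Fc (custom format), pipe it into pg_restore
instead:

bzcat filename.bz2 | pg_restore -h newserver -d newdb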

-----Original Message-----
From: pgsql-admin-owner(at)postgresql(dot)org
[mailto:pgsql-admin-owner(at)postgresql(dot)org] On Behalf Of Steve Burrows
Sent: Friday, April 28, 2006 11:58 AM
To: pgsql-admin(at)postgresql(dot)org
Subject: [ADMIN] Backing up large databases

I am struggling to find an acceptable way of backing up a PostgreSQL 7.4
database.

The database is quite large, currently about 180GB, divided into two
parts: a set of active tables and a set of archive tables that are
only used for insertions.

I ran pg_dump -Fc recently; it took 23.5 hours to run and produced a
single file of 126GB. Obviously, as the database continues to grow, it
will soon be so large that it cannot be pg_dumped within a day. Running
rsync to make a complete fresh copy of the pgsql file structure took 4
hours, but later that day another iteration of rsync (which should have
copied only changed files) took 3 hours, and I cannot afford to have
the db down that long.
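
For reference, both rsync passes were along these lines, with the
second expected to skip unchanged files (the paths here are
illustrative):

rsync -a --delete /var/lib/pgsql/data/ /backup/pgsql/data/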

Anybody with any ideas? The database is being used as the backend for a
mail server, so it has transactions 24 hours a day but is quieter at
night. I want to be able to back it up or replicate it on a daily basis
with minimum downtime so that the mail backlog doesn't get too large.
Ideally I want the first generation of backup/replica going onto the
same machine as the original, because the volume of data is such that
any attempt at network or tape backup of the live files would require
too much downtime; once I've got a backup, I can copy it out to other
NAS or tape at leisure.

If anyone has experience of safeguarding a similarly large PostgreSQL
database with minimal downtime, I'd be delighted to hear from you. The
machine runs two Xeons, 4GB of RAM and a half-terabyte RAID10 array on
a Dell PERC SCSI subsystem, with a load average of around 0.5-0.6, so
it's not exactly overstretched.

Thanks,

Steve
