Re: Backing up large databases

From: "Andy Shellam" <andy(dot)shellam(at)mailnetwork(dot)co(dot)uk>
To: "'Steve Burrows'" <steve(at)jla(dot)com>
Cc: <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Backing up large databases
Date: 2006-04-28 15:34:35
Message-ID: 20060428153430.B545E11F602A@postgresql.org
Lists: pgsql-admin

Hi Steve,

Very interested to hear about your setup, as I have a similar one (the backend
to a mail server/spam scanner), although under a much lighter load at the
moment.

My database is only just touching a GB, so it's nowhere near the scale of yours!
I use a file-system-level backup, and am currently testing PITR continuous
recovery onto a hot-standby server.

Tarring the database directory currently takes about a minute (at 1GB), so you
can estimate it would be roughly 3 hours for yours.
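
For anyone following along, an online file-system copy on 8.x looks roughly
like the following - the paths, label and user below are placeholders, not my
real ones:

  # tell the server an online base backup is starting (8.0 and later only)
  psql -U postgres -c "SELECT pg_start_backup('weekly_base');"

  # copy the cluster directory while the server stays up
  tar czf /backups/base-$(date +%Y%m%d).tar.gz /var/lib/pgsql/data

  # tell the server the copy has finished
  psql -U postgres -c "SELECT pg_stop_backup();"

The copy only becomes restorable once the WAL segments written between the two
calls have been archived as well.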

My future plan, for when the database grows larger, is to use WAL archiving:
take a base backup on a Sunday morning (our quietest time), ship it to the
hot standby once a week, and start the standby in recovery mode (using the
rolling-WAL script I'm testing now). Then, throughout the week, send the WAL
segments from the live box down to the standby as they become available, where
they get processed on arrival - these files are 16MB each (I believe this can
be changed, though only at compile time).
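
Concretely, the shipping side is just an archive_command on the live server and
a restore_command in recovery.conf on the standby - something along these lines,
where the hostname, directories and the wait-for-WAL script name are
placeholders (the waiting is what my rolling-WAL script actually does):

  # postgresql.conf on the live server (8.0+): push each segment as it fills
  archive_command = 'rsync -a %p standby:/var/lib/pgsql/wal_archive/%f'

  # recovery.conf on the standby: wait for each requested segment, then return it
  restore_command = '/usr/local/bin/wait-for-wal.sh /var/lib/pgsql/wal_archive/%f %p'

As long as the restore_command keeps waiting rather than failing, the standby
stays in recovery and keeps rolling forward as new segments arrive.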

The beauty of all this is that it doesn't require the database to be taken
offline or slowed down.

This is coming from an 8.1 server. As far as I'm aware the WAL archiving and
PITR features only appeared in 8.0, so I don't believe this will work on 7.4
without an upgrade first.

Regards

Andy

_____

From: pgsql-admin-owner(at)postgresql(dot)org
[mailto:pgsql-admin-owner(at)postgresql(dot)org] On Behalf Of Steve Burrows
Sent: 28 April 2006 4:58 pm
To: pgsql-admin(at)postgresql(dot)org
Subject: [ADMIN] Backing up large databases

I am struggling to find an acceptable way of backing up a PostgreSQL 7.4
database.

The database is quite large - it currently occupies about 180GB, divided into
two parts: a set of active tables, and a set of archive tables which are only
used for insertions.

I ran pg_dump -Fc recently; it took 23.5 hours to run and produced a single
126GB file. Obviously, as the database continues to grow, it will soon be so
large that it cannot be pg_dumped within a day. Running rsync to make a
complete fresh copy of the pgsql file structure took 4 hours, but a second
rsync run later that day (which should only have copied changed files) still
took 3 hours, and I cannot afford to have the db down that long.
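
For reference, the invocations were roughly of this shape (the database name
and paths here are illustrative rather than the real ones):

  # compressed custom-format dump: 23.5 hours, 126GB of output
  pg_dump -Fc -f /backup/maildb.dump maildb

  # file-level copy of the cluster; only consistent if the server is stopped first
  rsync -a /var/lib/pgsql/data/ /backup/pgsql-data/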

Anybody with any ideas? The database is being used as the backend for a mail
server, so it has transactions 24 hours a day but is quieter at night. I want
to be able to back it up or replicate it on a daily basis with minimum downtime
so that the mail backlog doesn't get too large. Ideally I want the first
generation of backup/replica to go onto the same machine as the original,
because the volume of data is such that any attempt at network or tape backup
of the live files would require too much downtime; once I've got a backup I can
copy it out to other NAS or tape at leisure.

If anyone has experience of safeguarding a similarly large PostgreSQL database
with minimal downtime, I'd be delighted to hear it. The machine is running 2
Xeons, 4GB RAM and a half-terabyte RAID10 array on a Dell PERC SCSI subsystem,
with a load average of around 0.5 - 0.6, so it's not exactly overstretched.

Thanks,

Steve