From: "Joey K(dot)" <pguser(at)gmail(dot)com>
To: "Bill Moran" <wmoran(at)potentialtech(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Seeking datacenter PITR backup procedures [RESENDING]
Date: 2007-08-20 23:57:01
Message-ID: 467669b30708201657kcc8e627t7af8fcbe563585f3@mail.gmail.com
Lists: pgsql-general

> I'm guessing you're in a hurry or in a pinch that you need to repost
> after one day on a weekend. I was waiting to let someone more
> knowledgeable answer, but I've had some experience with this, so
> I'll answer to the best of my ability.

I apologize. I wasn't sure whether my first email had been caught by a spam
filter, since I saw other posts getting through. I'll be more patient next
time ;-)

> > We have several web applications with Pg 8.2.x running on isolated
> > servers (~25). The database size on each machine (du -h pgdata) is
> > ~2 GB. We have been using nightly filesystem backups (stop pg, tar
> > backup to ftp, start pg) and it has worked well.
>
> Any reason why you haven't been using pg_dump? There are a LOT of
> drawbacks to doing filesystem level backups. For example, you can't
> restore to disparate hardware (a filesystem backup made from PG on
> an i386 system won't work on an amd64 system, for example).

We have used pg_dump and like it. The drawback is that it is excruciatingly
slow for backups. Although our databases are ~2GB currently, they will grow
to 6~10 GB per database in the next 6 months and 25~30GB in a year.

Our hardware configuration is well managed and we do not expect to switch
architectures often. If we ever do, we will perform a pg_dump/restore at
that time.
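That said, if we do stay with pg_dump for now, a nightly dump along these
lines would avoid the shutdown entirely (just a sketch; the /backups path
and database names are placeholders, not our real layout):

```shell
# Sketch of a nightly pg_dump backup; /backups and the database
# name are placeholders.
dump_name() {
    # date-stamped file name for a given database
    printf '%s-%s.dump' "$1" "$(date +%Y%m%d)"
}

backup_db() {
    # -Fc writes a compressed, architecture-neutral custom-format
    # archive; no server shutdown is needed
    pg_dump -Fc -f "/backups/$(dump_name "$1")" "$1"
}
```

An archive made this way is restored with pg_restore and, as you say, is
portable across architectures, which a filesystem-level copy is not.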

> > We would like to move to PITR backups since the database size will
> > increase moving forward and our current backup method might increase
> > server downtimes.
>
> How much do you expect it to increase? 2G is _tiny_ by modern
> standards. Even if you expect it to increase an order of magnitude,
> it's still a reasonable size for pg_dump.
>
> Some huge advantages to pg_dump:
> *) architecture-neutral dumps
> *) No need to stop the database
> *) Extremely simple procedure for backup and restore
> *) Human parseable backups (you may not even need to restore, just
> look through the data to see what was there in some cases)
> *) Can restore a database without shutting down a server, thus you
> can move a database from one server to another (for example)
> without affecting work occurring on the second server.

We have only used pg_dump on an offline database. If pg_dump is run against
a live database, will the restored copy be consistent?

>
> > ** START **
> >
> > tmpwal = "/localhost/tmp"  # tmp space on server 1 for storing WAL
> > files before ftp
> > Configure $pgdata/postgresql.conf archive_command = "cp %p $tmpwal/%f"
> >
> .....
> >
> > Recovery on server1 (skeleton commands),
> > % rm -f $tmpwal/*
>
> Why are you removing this day's WAL logs before recovery? If the
> disaster occurs close to your backup time, this will result in the
> loss of an entire day's data changes.

Thanks for pointing this out.
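For the archive, here is the recovery skeleton revised to keep the day's WAL
and to write a proper restore_command setting into recovery.conf (if I read
the docs correctly, recovery.conf wants a restore_command line, not the bare
command; paths are still the placeholders from my earlier mail):

```shell
# Revised recovery sketch: do NOT clear the WAL archive first.
write_recovery_conf() {
    # $1 = WAL archive dir, $2 = target pgdata
    printf "restore_command = 'cp %s/%%f %%p'\n" "$1" > "$2/recovery.conf"
}

# Recovery steps (placeholder paths from the earlier mail):
#   cp -r pgdata.hosed/pg_xlog pgdata/
#   write_recovery_conf /localhost/tmp pgdata
#   pg_ctl -D pgdata start   # recovery begins, replaying from the archive
```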

> .....
> > % cp -r pgdata.hosed/pg_xlog pgdata/
> > % echo "cp $tmpwal/%f %p" > pgdata/recovery.conf
> > % start pg (recovery begins)
> >
> > ** END **
>
> Again, how much WAL traffic are you generating? Make sure you have
> enough free space on the recovery system to hold all of the WAL logs
> in the event you need to recover.

How do I measure the WAL traffic generated? Is it the day-to-day growth of
pg_xlog? We plan for a database-size-to-disk-space ratio of 1:1.5 on these
servers.
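In case it helps the discussion, here is one way I imagine measuring it,
assuming 8.2's pg_current_xlog_location() function: sample the location
twice, a day apart, and take the byte difference (a sketch, not something
we have run yet):

```shell
# Sketch: estimate WAL bytes generated between two sampled locations.
lsn_to_bytes() {
    # convert an LSN like 0/1A2B3C4D into an absolute byte offset
    echo $(( ( 0x${1%/*} << 32 ) + 0x${1#*/} ))
}

wal_bytes_between() {
    # bytes of WAL written between two LSN samples
    echo $(( $(lsn_to_bytes "$2") - $(lsn_to_bytes "$1") ))
}

# Usage (sample once, wait a day, sample again):
#   start=$(psql -Atc "SELECT pg_current_xlog_location()")
#   ...
#   end=$(psql -Atc "SELECT pg_current_xlog_location()")
#   wal_bytes_between "$start" "$end"
```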

> Note that this procedure does not do a good job of protecting you
> from catastrophic hardware failure. In the event that your RAID
> system goes insane, you can lose as much as an entire day's worth
> of updates, and there's no reason to.

Currently, a day's worth of data loss is an accepted risk. Once we upgrade
our infrastructure (moving these servers from FTP to NFS), our backups
should become near-realtime.
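When the NFS mount is in place, I am thinking of something like the
following for the archive side, so that a re-run never clobbers a segment
that was already archived (the /mnt/walarchive mount point is a placeholder):

```shell
# Sketch of the archive_command semantics for an NFS archive directory.
# In postgresql.conf this would be roughly:
#   archive_command = 'test ! -f /mnt/walarchive/%f && cp %p /mnt/walarchive/%f'
archive_one() {
    # $1 = WAL segment path (%p), $2 = archive dir
    fname=$(basename "$1")
    # fail rather than overwrite an existing archived segment
    test ! -f "$2/$fname" && cp "$1" "$2/$fname"
}
```

Returning nonzero on an existing file makes the server retry the segment
later instead of silently losing it.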

> I _highly_ recommend you stage some disaster scenarios and actually
> use your procedure to restore some databases before you trust it.
> Getting PITR working effectively is tricky, no matter how many questions
> you ask of knowledgeable people. You're going to want to have
> first-hand experience going through the process.

Absolutely. We will run tests to ensure the process works. If others could
share their experience implementing PITR in an environment like ours, it
might help us avoid some initial pitfalls.

Joey K
