
Re: [PERFORMANCE] Buying hardware

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Jeff <threshar(at)torgo(dot)978(dot)org>
Cc: David Rees <drees76(at)gmail(dot)com>, Matthew Wakeling <matthew(at)flymine(dot)org>, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, A B <gentosaker(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORMANCE] Buying hardware
Date: 2009-01-27 01:07:33
Message-ID: 497E5E55.4020601@postnewspapers.com.au
Lists: pgsql-performance
Jeff wrote:

> If you use PITR + rsync you can create a binary snapshot of the db, so 
> restore time is simply how long it takes to untar / whatever it into 
> place.  Our backup script basically does:
> 
> archive backup directory
> pg_start_backup
> rsync
> pg_stop_backup
> 
> voila. I have 2 full copies of the db.
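
For reference, a minimal sketch of what such a script might look like (the
paths, username, and backup label here are hypothetical):

    #!/bin/sh
    # Hypothetical layout -- adjust to suit.
    PGDATA=/var/lib/pgsql/data
    BACKUP=/backup/pgsql

    # Archive the previous snapshot aside first.
    mv "$BACKUP/snapshot" "$BACKUP/snapshot.old"

    # Tell Pg a base backup is starting (forces a checkpoint).
    psql -U postgres -c "SELECT pg_start_backup('nightly');"

    # Copy the cluster while it's in backup mode.
    rsync -a "$PGDATA/" "$BACKUP/snapshot/"

    # End backup mode; Pg archives the WAL needed for consistency.
    psql -U postgres -c "SELECT pg_stop_backup();"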

Note that this does NOT protect you against a Pg bug, a filesystem 
issue, or a RAID / disk media problem that corrupts or damages the 
database in a way that isn't immediately detected.

I personally wouldn't want to rely on PITR alone. I take periodic SQL 
dumps as well, using PITR more for disaster recovery in case of 
software/user error, so I can roll back to when the error occurred 
rather than the potentially much earlier time of the last backup.

A dump is much more likely to trip over a problem, so it's good to run 
one periodically just to make sure the entire DB is readable. It's also 
more likely to be restorable if something goes horribly wrong.
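
For example, a periodic dump can be as simple as this (database name and
paths are hypothetical):

    # Custom-format dump: compressed, and restorable with pg_restore.
    pg_dump -Fc -U postgres mydb > /backup/mydb-$(date +%Y%m%d).dump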

Personally I might be OK with WAL-archiving based backups if I was using 
a warm-standby server in continuous recovery mode that'd notify me if 
anything went wrong with the restore. I'm much more comfortable having 
an SQL dump around, though.
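
By way of illustration, a warm standby on 8.x sits in continuous recovery
with something like this in recovery.conf, using the contrib pg_standby
program (paths and trigger file are hypothetical):

    # recovery.conf on the standby -- keeps replaying archived WAL
    # until the trigger file appears; watch its logs for errors.
    restore_command = 'pg_standby -t /tmp/pgsql.trigger /archive %f %p %r'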

> You could even expand it a bit 
> and after the rsync & friends have it fire up the instance and run 
> pg_dump against it for a pg_restore compatible dump "just in case".

Yep, that's what I do.
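
Roughly like this (port and paths hypothetical; the snapshot must include
the WAL needed to reach consistency):

    # Start a throwaway postmaster on the snapshot, on a spare port.
    pg_ctl -D /backup/snapshot -o "-p 5433" -w start

    # Take a pg_restore-compatible dump from it, then shut it down.
    pg_dump -p 5433 -Fc -U postgres mydb > /backup/mydb.dump
    pg_ctl -D /backup/snapshot -m fast stop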

> It takes a long time to restore a 300GB db, even if you cheat and 
> parallelify some of it. 8.4 may get a  pg_restore that can load in 
> parallel - which will help somewhat.

It's a great pity that pg_dump can't dump in parallel, though. It makes 
sense given that Pg has no mechanism for one backend to "join" another's 
existing transaction, and without that you'd risk inconsistent 
snapshots, but it's still a bit of a pity.

Is it actually that hard to let one backend join another's (strictly 
read-only) transaction? The first backend already ensures that tuples 
in the xmin/xmax range required by the transaction aren't VACUUMed away, etc.
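
The sort of interface I have in mind would look something like this
(the function names here are illustrative, not an existing API):

    -- Backend 1: open a read-only transaction and publish its snapshot.
    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY;
    SELECT pg_export_snapshot();   -- returns a snapshot identifier

    -- Backend 2: adopt the same snapshot, seeing identical data.
    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY;
    SET TRANSACTION SNAPSHOT '00000003-0000001B-1';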

--
Craig Ringer
