Server move using rsync

From: "Stephen Denne" <stephen(dot)denne(at)datam(dot)co(dot)nz>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Server move using rsync
Date: 2011-10-28 01:37:18
Message-ID: F0238EBA67824444BC1CB4700960CB480F76F0D5@dmpeints002.isotach.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

We're intending to move a 470GB PostgreSQL 8.3.13 database using the following technique from http://www.postgresql.org/docs/8.3/interactive/backup-file.html
 
"Another option is to use rsync to perform a file system backup. This is done by first running rsync while the database server is running, then shutting down the database server just long enough to do a second rsync. The second rsync will be much quicker than the first, because it has relatively little data to transfer, and the end result will be consistent because the server was down. This method allows a file system backup to be performed with minimal downtime."

Except that we plan on an initial rsync which we think might take a couple of days, then subsequent daily rsyncs for up to a week to keep it up to date till we stop the old database, rsync again, and start the new database.

A very rough approximation of our database would be half a dozen large tables taking up 1/3 of the disk space, and lots of indexes on those tables taking the other 2/3 of the space.

If we assume usage characteristics of:
Much less than 1% of indexed data changing per day, with almost all of those updates being within the 1% of most recently added data.
Much less than 1% of historical indexed data being deleted per day with most of the deletions expected to affect sets of contiguous file pages.
About 1% of new indexed data added per day

I'm curious of the impact of vacuum (automatic and manual) during that process on expected amount of work rsync will have to do, and time it will take, and on what the update pattern is on files of Btree indexes.

Is it worth making sure vacuum is not run, in order to reduce the amount of files that change during that period?

Do a number of additions evenly spread through the domain of an indexed field's values result in localized changes to the indexes files, or changes throughout the files?

How about for additions to the end of the domain of an indexed field's values (e.g. adding current dates)?

Is there any way during that week, that we can verify whether our partially completed database move process is going to result in a database that starts up ok?

Regards, Stephen Denne.
This email with any attachments is confidential and may be subject to legal privilege. If it is not intended for you please advise by replying immediately, destroy it and do not copy, disclose or use it in any way.

Please consider the environment before printing this e-mail
__________________________________________________________________
This email has been scanned by the DMZGlobal Business Quality
Electronic Messaging Suite.
Please see http://www.dmzglobal.com/dmzmessaging.htm for details.
__________________________________________________________________

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Josh Kupershmidt 2011-10-28 02:25:09 Re: PostgreSQL at LISA in Boston: Dec. 7-8
Previous Message Joshua D. Drake 2011-10-27 23:54:07 Re: PostGIS in a commercial project