Re: pg_upgrade and rsync

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: pg_upgrade and rsync
Date: 2015-01-23 20:52:46
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 23, 2015 at 02:34:36PM -0500, Stephen Frost wrote:
> > Why? Just rsync the new data directory onto the old directory on the
> > standbys. That's fine and simple.
> That still doesn't address the need to use --size-only, it would just
> mean that you don't need to use -H. If anything the -H part is the
> aspect which worries me the least about this approach.

It took me a while to understand what Stephen was saying, so let me
explain the details so everyone can get on the same page.

First, let's look at the downsides of using non-hardlink rsync against a
slave cluster, whether we run pg_upgrade on the slave or not:

o must preserve db directory and relfilenodes (4 new things for
pg_upgrade to preserve)
o must checksum files because there is no way to distinguish
user tables/indexes (which don't need to be copied) from system
tables/indexes (which must be copied so it is in sync with
the master)
o must use log_wal_hints when the slave is installed so the
checksums match

So, even if if all the checksums work, it will be slow/expensive.

Stephen's idea is quite interesting. You run pg_upgrade on the master,
then, before you start the new server, you run rsync with special flags
and sync the old _and_ new clusters on the master with just the old
cluster on the standby (yeah, odd). Yes, this is very odd, and where I
got lost too.

First, this only works when pg_upgrade is run in --link mode. What
rsync --hard-links --size-only is going to do is record which files have
hard links, remember their inode numbers, and cross-reference the
hard-linked files. When doing the rsync remote comparisons, the
master's old relfilenodes will match the standby's old relfilenodes, and
because we are using --size-only, they will be considered identical and
not copied, or even checksumed. When it goes to do the standby's new
cluster, none of the directories will exist, so they will all be copied
along with the system objects (they are small), but the user
tables/indexes will be identified as already existing in the slave's old
cluster so it will hard-link to those standby's old cluster files. Once
rsync is complete, you can delete the old cluster on master and standby.
This is effectively duplicating the way pg_upgrade works.

What is interesting is that this will work on any version of pg_upgrade,
with no modifications, as long as link mode is used. You _cannot_ run
initdb on the standby, as this will create system files that would
prevent the master's system files from being copied. This is also going
to remove your recovery.conf on the standby, and replace your
postgresql.conf with the master's, so any modifications you made to the
standby will have to be saved and restored in to the new cluster before

I plan to run some tests soon to verify this method works, and if so, I
can document it in the pg_upgrade manual page.

Bruce Momjian <bruce(at)momjian(dot)us>

+ Everyone has their own god. +

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2015-01-23 20:58:57 Re: REVIEW: Track TRUNCATE via pgstat
Previous Message Stephen Frost 2015-01-23 20:15:15 Re: pgaudit - an auditing extension for PostgreSQL