Re: Use of rsync for data directory copying

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: David Kerr <dmk(at)mr-paradox(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Use of rsync for data directory copying
Date: 2012-07-15 05:02:01
Message-ID: 20120715050201.GQ1267@tamriel.snowman.net
Lists: pgsql-hackers

Bruce,

* Bruce Momjian (bruce(at)momjian(dot)us) wrote:
> On Sat, Jul 14, 2012 at 09:17:22PM -0400, Stephen Frost wrote:
> > So, can you explain which case you're specifically worried about?
>
> OK. The basic problem is that I previously was not clear about how
> reliant our use of rsync (without --checksum) was on the presence of WAL
> replay.

We should only be relying on WAL replay for hot backups taken between
pg_start_backup() and pg_stop_backup().

> Here is an example from our documentation that doesn't have WAL replay:
>
> http://www.postgresql.org/docs/9.2/static/backup-file.html
>
> Another option is to use rsync to perform a file system backup. This is
> done by first running rsync while the database server is running, then
> shutting down the database server just long enough to do a second rsync.
> The second rsync will be much quicker than the first, because it has
> relatively little data to transfer, and the end result will be
> consistent because the server was down. This method allows a file system
> backup to be performed with minimal downtime.

To be honest, this looks like a recommendation that might date all the
way back to before we had hot backups. Technically speaking, it should
work fine to use the above method with the start/stop backup done only
around the second rsync, if there's a reason to implement such a system
(perhaps the WAL files accumulate too quickly, or become too numerous,
for a full rsync between the start_backup and stop_backup?).

> Now, if a write happens in both the first and second half of a second,
> and only the first write is seen by the first rsync, I don't think the
> second rsync will see the write, and hence the backup will be
> inconsistent.

To be more specific, rsync relies on the combination of mtime and size
to tell if the file has been changed or not. In contrast, cp --update
looks like it might only depend on mtime (from reading the cp man page
on a Debian system).
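
As a rough illustration of that quick check (a sketch only, not rsync's
actual code), the decision can be modelled as below. The function name
is made up for this example, and comparing mtime at whole-second
granularity is an assumption that matches the case being discussed
here.

```python
import os


def quick_check_would_skip(src_path, dst_path):
    """Model of a size+mtime quick check (illustrative, not rsync source).

    Returns True when the two files have the same size and the same
    modification time in whole seconds, i.e. when a tool using this
    check would not look at the file contents at all.
    """
    src = os.stat(src_path)
    dst = os.stat(dst_path)
    same_size = src.st_size == dst.st_size
    # Assumption: timestamps are compared at one-second granularity.
    same_mtime = int(src.st_mtime) == int(dst.st_mtime)
    return same_size and same_mtime
```

Note that the file contents are never read in this model, which is why
the check is cheap and also why it can miss a change.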

It seems like there could be an issue where PG is writing to a file, an
rsync comes along and copies the file, and then PG writes to that same
file again, after the rsync is done, but within the same second. If the
file size isn't changed by that write, a later rsync might feel that it
isn't necessary to check if the file contents changed. I have to say
that I don't believe I've ever seen that happen though.
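
To make the scenario concrete, here is a small simulation (a sketch
under stated assumptions, not a reproduction against a real server).
The file names, the fixed timestamp, and the use of shutil.copy2 to
stand in for an mtime-preserving copy are all made up for the example;
os.utime forces both writes into the same second.

```python
import os
import shutil
import tempfile


def simulate_same_second_rewrite():
    """Return (would_skip, contents_differ) for a same-second rewrite."""
    workdir = tempfile.mkdtemp()
    try:
        src = os.path.join(workdir, "relation_file")  # hypothetical name
        dst = os.path.join(workdir, "backup_copy")    # hypothetical name
        fixed_time = 1342328521  # arbitrary timestamp, whole seconds

        # First write, then the "first rsync" copies the file and keeps
        # its mtime.
        with open(src, "wb") as f:
            f.write(b"AAAA")
        os.utime(src, (fixed_time, fixed_time))
        shutil.copy2(src, dst)

        # Second write: same size, forced into the same second.
        with open(src, "wb") as f:
            f.write(b"BBBB")
        os.utime(src, (fixed_time, fixed_time))

        # A size+mtime comparison now sees nothing to do.
        s, d = os.stat(src), os.stat(dst)
        would_skip = (s.st_size == d.st_size
                      and int(s.st_mtime) == int(d.st_mtime))

        with open(src, "rb") as f1, open(dst, "rb") as f2:
            contents_differ = f1.read() != f2.read()
        return would_skip, contents_differ
    finally:
        shutil.rmtree(workdir)
```

Here the comparison reports the copy as up to date even though the
bytes differ. Comparing contents instead, which is what rsync's
--checksum option is for, would catch the change at the cost of
reading every file.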

Thanks,

Stephen
