Re: rsync and streaming replication

From: Scott Ribe <scott_ribe(at)elevated-dev(dot)com>
To: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
Cc: "[ADMIN]" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: rsync and streaming replication
Date: 2011-11-15 15:59:10
Message-ID: 640CEEBE-CB12-4FC9-9758-0319A49E7ACF@elevated-dev.com
Lists: pgsql-admin

On Nov 15, 2011, at 8:30 AM, Cédric Villemain wrote:

> Seriously, I did. Is my post "just for the value : rsync --checksum is
> the option to use to prevent copying of **identical files**" incorrect
> ?

It's at least incomplete and somewhat misleading. But I guess you could say the same about my post; we seem to be focusing on 2 different aspects of its behavior ;-)

> OP contains "It looks that all database files do not have the same
> modification date in the master node and in the slave nodes, so the
> rsync copies quite all the database from the new master to the
> slaves."

Yes, and in that case rsync should be copying only the changed blocks. You are aware of that, but the OP did not seem to realize it.
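
To make the "only changed blocks" point concrete, here is a rough Python sketch. It is not rsync's code: the real delta algorithm uses a rolling checksum so it can match blocks at any offset, while this simplified stand-in only compares blocks at fixed offsets. The names `BLOCK_SIZE`, `block_digests`, and `changed_blocks` are made up for illustration, and MD5 is just a convenient digest here.

```python
import hashlib
from typing import List

# Assumption: a tiny block size so the example data stays readable.
BLOCK_SIZE = 4


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Return one digest per fixed-size block of data."""
    return [
        hashlib.md5(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(src: bytes, dst: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """Indexes of source blocks whose digest differs from the destination's."""
    src_digests = block_digests(src, block_size)
    dst_digests = block_digests(dst, block_size)
    changed = []
    for index, digest in enumerate(src_digests):
        # A block missing on the destination counts as changed.
        if index >= len(dst_digests) or digest != dst_digests[index]:
            changed.append(index)
    return changed


master = b"AAAABBBBCCCCDDDD"
slave_identical = b"AAAABBBBCCCCDDDD"
slave_stale = b"AAAAXXXXCCCCDDDD"

# Identical content: nothing to send, whatever the timestamps say.
print(changed_blocks(master, slave_identical))
# One differing block: only that block would need to cross the network.
print(changed_blocks(master, slave_stale))
```

Note that both sides still read every byte to compute the digests, so the saving is in network traffic, not in disk I/O.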

> One benefit is when files are in fact identical on both side, so that
> rsync does not have to process checksum for each blocks on source and
> destination. (when there are few changes, we expect rsync to copy only
> those few changes, with or without --checksum).

Well, but with --checksum it does calculate checksums on the entire contents of both files (which takes as much I/O and about as much CPU as calculating checksums for each block), even when timestamps & sizes are identical.
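
A small Python sketch of the two "is this file unchanged?" tests under discussion may help. It is a simplified model, not rsync's implementation: `quick_check_same` mimics the default size + modification time comparison, `checksum_same` mimics a --checksum style whole-file comparison, and all function names, file names, and timestamps are invented for the example.

```python
import hashlib
import os
import tempfile


def quick_check_same(path_a: str, path_b: str) -> bool:
    """Default-style test: same size and same modification time (whole seconds)."""
    stat_a, stat_b = os.stat(path_a), os.stat(path_b)
    return (stat_a.st_size == stat_b.st_size
            and int(stat_a.st_mtime) == int(stat_b.st_mtime))


def whole_file_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """Digest of the entire file; every byte has to be read from disk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_same(path_a: str, path_b: str) -> bool:
    """Checksum-style test: sizes match and whole-file digests match."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    return whole_file_digest(path_a) == whole_file_digest(path_b)


with tempfile.TemporaryDirectory() as workdir:
    master_file = os.path.join(workdir, "master_copy")
    slave_file = os.path.join(workdir, "slave_copy")
    for path in (master_file, slave_file):
        with open(path, "wb") as handle:
            handle.write(b"identical table data" * 1000)

    # The OP's situation: same content, different modification times.
    os.utime(master_file, (1_000_000_000, 1_000_000_000))
    os.utime(slave_file, (1_000_000_600, 1_000_000_600))
    quick_differing_mtime = quick_check_same(master_file, slave_file)
    checksum_differing_mtime = checksum_same(master_file, slave_file)

    # Same content and same modification time.
    os.utime(slave_file, (1_000_000_000, 1_000_000_000))
    quick_equal_mtime = quick_check_same(master_file, slave_file)

print(quick_differing_mtime, checksum_differing_mtime, quick_equal_mtime)
```

With differing timestamps the quick check flags the file as a transfer candidate while the whole-file comparison recognizes it as identical, but only after reading both copies in full.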

For the OP's case, identical files with differing timestamps, the only potential saving is from not exchanging checksums over the network, which is not likely to offer any meaningful improvement in performance. That still leaves open the question of why rsync is so slow in that case, when we know it is usually relatively fast to sync two servers with few differences.

Would be nice to actually hear from OP regarding file sizes/counts, network bandwidth, disks, and so on ;-)

--
Scott Ribe
scott_ribe(at)elevated-dev(dot)com
http://www.elevated-dev.com/
(303) 722-0567 voice
