Re: rsync and streaming replication

From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Jean-Armel Luce <jaluce06(at)gmail(dot)com>
Cc: Scott Ribe <scott_ribe(at)elevated-dev(dot)com>, "[ADMIN]" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: rsync and streaming replication
Date: 2011-11-15 20:31:38
Message-ID: CAF6yO=3G-P0KgSH=Eiqw3Se38Civ7QktKF8q2-Zs0Rgy47=6Ww@mail.gmail.com
Lists: pgsql-admin

> This afternoon, I again sent some update requests, which were
> replicated to the slaves.
>
> - I am looking at the modification dates and checksums of 2 tables
> among my 6000 tables:
> For each file, the checksum is the same on all the slaves, but different
> from the checksum of the master.
> For each file, the modification time is different on each node. (see below)

You are probably being hit by "hint bits": changes to them are not
WAL-logged in 9.0, so the files can differ simply because of a "select"
you issued on the master and/or the standby.
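
To illustrate why this matters for file comparison, here is a toy sketch in Python. It is not the real PostgreSQL page layout: the all-zero page, the byte offset and the bit position are arbitrary assumptions, chosen only to show that setting a single bit changes a file's checksum while leaving its size untouched.

```python
import hashlib

PAGE_SIZE = 8192  # default PostgreSQL block size; the page content below is a toy


def set_bit(page: bytes, byte_offset: int, bit: int) -> bytes:
    """Return a copy of `page` with one bit set, standing in for a hint bit."""
    changed = bytearray(page)
    changed[byte_offset] |= 1 << bit
    return bytes(changed)


# Hypothetical page as it sits on the master: all zeroes.
master_page = bytes(PAGE_SIZE)
# Same page on a standby after a read set one "hint bit" (offset/bit are arbitrary).
standby_page = set_bit(master_page, 100, 3)

same_size = len(master_page) == len(standby_page)
same_checksum = (
    hashlib.sha256(master_page).hexdigest()
    == hashlib.sha256(standby_page).hexdigest()
)

print("same size:", same_size)
print("same checksum:", same_checksum)
```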

>
> So if I want to promote one slave as the master, it will not need to copy
> data from the new master to the previous slaves with rsync, but it will copy
> all the files from the new master to the old master (which is now a slave).
>
> I shall try tomorrow to promote a slave again, and I shall rsync with
> --checksum.
> I don't think that -a is very useful with --checksum (no need to
> preserve modification times). Do you agree?

-a is a good shortcut, and changing the modtime is not a real cost. So
even though you don't need to keep the mtime, there is no benefit in not
keeping it :)
Well, after re-reading the rsync manual, and taking Scott's answers into
account:
using --ignore-times will make all files be rsynced (it will check each
block and copy only the blocks which differ);
using --checksum will make all files be read and checksummed on both
sides before trying to rsync them (then check each block and copy it
if required). Obviously, when the files do not have the same size, they
are rsynced without a 'global' checksum.
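
The three behaviours described above can be modelled roughly as follows. This is a simplified Python sketch of the per-file decision only, not rsync's actual implementation; the FileInfo type, the needs_transfer function and the mode names are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    size: int       # file size in bytes
    mtime: int      # modification time (seconds)
    checksum: str   # whole-file checksum, only consulted in "checksum" mode


def needs_transfer(src: FileInfo, dst: FileInfo, mode: str = "quick") -> bool:
    """Simplified model of whether a file is handed to the delta transfer."""
    if src.size != dst.size:
        # Different sizes: always transferred, no whole-file checksum needed.
        return True
    if mode == "quick":
        # Default quick check: same size and same mtime means "skip".
        return src.mtime != dst.mtime
    if mode == "ignore-times":
        # --ignore-times: never skip, every file goes through the delta transfer.
        return True
    if mode == "checksum":
        # --checksum: same size, so compare whole-file checksums.
        return src.checksum != dst.checksum
    raise ValueError(f"unknown mode: {mode}")


# Same size and mtime, different content (the case the quick check misses).
master = FileInfo(size=8192, mtime=1321389098, checksum="aaa")
standby = FileInfo(size=8192, mtime=1321389098, checksum="bbb")

for mode in ("quick", "ignore-times", "checksum"):
    print(mode, needs_transfer(master, standby, mode))
```

In this model only the default quick check skips the file whose content differs; both other modes pick it up, at the cost of reading every file.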

It is safer (some would say paranoid, which is correct) to use one of
those in the PostgreSQL case, where files have a size limit and can be
modified on both sides without their size changing. So there is a high
risk of having the same size on source and destination, and a very low
risk of having the same modification time when the content has changed.
I admit the risk is very low and in practice it should not happen. As
many things should not happen...
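
As a rough sketch of how the "paranoid" invocation could be assembled from a script, here is a small Python helper. The function name and both paths are placeholders I made up for illustration; only the -a, --checksum and --ignore-times options come from the discussion above, and the command is printed rather than executed.

```python
from typing import List, Optional


def build_rsync_command(source: str, destination: str,
                        paranoid: Optional[str] = "checksum") -> List[str]:
    """Build an rsync argument list; `paranoid` selects the extra safety option."""
    cmd = ["rsync", "-a"]  # -a also preserves mtime, which costs nothing
    if paranoid == "checksum":
        cmd.append("--checksum")      # compare whole-file checksums
    elif paranoid == "ignore-times":
        cmd.append("--ignore-times")  # never skip on matching size and mtime
    elif paranoid is not None:
        raise ValueError(f"unknown option: {paranoid}")
    cmd += [source, destination]
    return cmd


# Hypothetical paths: adjust to the real data directory and host.
example = build_rsync_command("/path/to/pgdata/", "standby:/path/to/pgdata/")
print(" ".join(example))
```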

If you want to reduce the cost of the re-rsync step, you may want to try
to have similar files in both places by running vacuum freeze before the
initial rsync, or something like that (so hint bits are set before rsync).

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation
