Re: pg_upgrade and rsync

From: David Steele <david(at)pgmasters(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_upgrade and rsync
Date: 2015-01-28 03:16:48
Message-ID: 54C854A0.3000307@pgmasters.net
Lists: pgsql-hackers

On 1/27/15 9:51 PM, Bruce Momjian wrote:
>> According to my empirical testing on Linux and OSX the answer is no:
>> rsync does not use sub-second accuracy. This seems to be true even on
>> file systems like ext4 that support sub-second mod times, at least it
>> was true on Ubuntu 12.04 running ext4.
>>
>> Even on my laptop there is a full half-second of vulnerability for
>> rsync. Faster systems may have a larger window.
> OK, bummer. Well, I don't think we ever recommend to run rsync without
> checksums, but the big problem is that rsync doesn't do checksums by
> default. :-(
>
> pg_upgrade recommends using two rsyncs:
>
> To make a valid copy of the old cluster, use <command>rsync</> to create
> a dirty copy of the old cluster while the server is running, then shut
> down the old server and run <command>rsync</> again to update the copy
> with any changes to make it consistent. You might want to exclude some
>
> I am afraid that will not work as it could miss changes, right? When
> would the default mod-time checking ever be safe?
>
According to my testing the default mod-time checking is never
completely safe in rsync. I've worked around this in PgBackRest by
building the manifest and then waiting until the start of the next
second before starting to copy. It was the only way I could make the
incremental backups reliable without requiring checksums (which, as in
rsync, are optional for performance reasons). Of course, you still have to
trust the clock for this to work.
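
A minimal sketch of the "build the manifest, then wait for the next
second" idea described above, in Python. This is not PgBackRest's actual
code: build_manifest, wait_for_next_second and prepare_copy are
hypothetical names, and the manifest here records only size and
whole-second mod time. The point is that any write made once copying
begins gets a mod time in a later second than anything in the manifest.

```python
import math
import os
import time


def build_manifest(root):
    """Map each relative file path under root to (size, whole-second mtime)."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            manifest[os.path.relpath(path, root)] = (st.st_size, int(st.st_mtime))
    return manifest


def wait_for_next_second(clock=time.time, sleep=time.sleep):
    """Block until the wall clock has moved into the next whole second."""
    target = math.floor(clock()) + 1
    while True:
        now = clock()
        if now >= target:
            return now
        # Sleep only for the remainder; loop again in case sleep returns early.
        sleep(target - now)


def prepare_copy(root):
    """Hypothetical driver: record the manifest, then delay the copy start."""
    manifest = build_manifest(root)
    wait_for_next_second()  # assumption: the system clock can be trusted
    return manifest
```

The clock and sleep parameters exist only so the wait can be exercised
without real delays; a caller would normally use the defaults.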

This is definitely an edge case. Not only does the file have to be
modified in the same second *after* rsync has done the copy, but the
file also has to not be modified in *any other subsequent second* before
the next incremental backup. If the file is busy enough to have a
collision with rsync in that second, then it is very likely to be
modified before the next incremental backup which is generally a day or
so later. And, of course, the backup where the issue occurs is fine -
it's the next backup that is invalid.

However, the hot/cold backup scheme as documented does make the race
condition more likely, since the two backups are run so close together
in time. Ultimately, the most reliable method is to use checksums.
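
To illustrate why checksums close the window, here is a small Python
sketch contrasting a size-plus-whole-second-mtime comparison (the kind
of quick check discussed above) with a content comparison. The function
names are hypothetical, and SHA-256 is only a stand-in; it is not
claimed to be the digest rsync or PgBackRest uses.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_by_quick_check(src, dst):
    """Treat files as identical if size and whole-second mtime match."""
    a, b = os.stat(src), os.stat(dst)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def same_by_checksum(src, dst):
    """Treat files as identical only if their contents hash the same."""
    if os.path.getsize(src) != os.path.getsize(dst):
        return False
    return file_digest(src) == file_digest(dst)
```

A file rewritten in place within the same second, with the same length,
passes same_by_quick_check but fails same_by_checksum, at the cost of
reading every file in full on both sides.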

For me the biggest issue is that there is no way to discover if a db is
consistent no matter how much time/resources you are willing to spend.
I could live with the idea of the occasional bad backup (since I keep as
many as possible), but having no way to know whether it is good or not
is very frustrating. I know data checksums are a step in that
direction, but they are a long way from providing the optimal solution.
I've implemented rigorous checksums in PgBackRest but something closer
to the source would be even better.

--
- David Steele
david(at)pgmasters(dot)net
