Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> there's a heck of alot of complexity there that we *don't* need.
> rsync is a great tool, don't get me wrong, but let's not try to go
> over our heads here.
Right -- among other things, it checks for portions of a new file
which match the old file at a different location. For example, if
you have a very large text file, and insert a line or two at the
start, it will wind up only sending the new lines. (Well, that and
all the checksums which help it determine that the rest of the file
matches at a shifted location.) I would think that PostgreSQL could
just check whether *corresponding* portions of a file matched, which
is much simpler.
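To make the contrast concrete, here is a minimal sketch (in Python, purely illustrative; PostgreSQL itself is written in C) of the simpler corresponding-block comparison described above. The function name, block size, and use of SHA-256 are assumptions for illustration, not anything from the thread. Note how an insertion at the front of the data shifts everything, so every following block reads as "changed" — the case rsync's rolling checksum handles but this scheme deliberately does not:

```python
import hashlib

def changed_blocks(old: bytes, new: bytes, block_size: int = 8192):
    """Compare *corresponding* fixed-offset blocks of two byte strings.

    Returns the indices of blocks whose contents differ. Unlike rsync's
    rolling checksum, a shift (e.g. inserting bytes at the front) makes
    every subsequent block look changed, because blocks are only ever
    compared at the same offset.
    """
    n = max(len(old), len(new))
    diffs = []
    for i in range(0, n, block_size):
        a = old[i:i + block_size]
        b = new[i:i + block_size]
        if hashlib.sha256(a).digest() != hashlib.sha256(b).digest():
            diffs.append(i // block_size)
    return diffs

# An append touches only the last block; an insertion at the
# start invalidates every block after the shift point.
base = bytes(range(256)) * 64          # 16 KiB -> two 8 KiB blocks
appended = base + b"Z"                 # changed_blocks -> [2]
shifted = b"Z" + base                  # changed_blocks -> [0, 1, 2]
```

For heap files the trade-off is acceptable: PostgreSQL overwrites pages in place rather than inserting bytes mid-file, so shifted content is not the common case it is for text files.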
> we already break relations into 1G chunks (when/if they reach that
> size), so you won't necessairly be copying the entire relation if
> you're just doing mtime based or per-file-checksum based
While 1GB granularity would be OK, I doubt it's optimal; I think CRC
checks for smaller chunks might be worthwhile. My gut feel is that
somewhere in the 64kB to 1MB range would probably be optimal for us,
although the "sweet spot" will depend on how the database is used.
A configurable or self-adjusting size would be cool.
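The chunk-size trade-off can be sketched as follows (an illustrative Python mock-up, not proposed PostgreSQL code; CRC-32 stands in for whatever checksum a real implementation would choose, and it can collide, so a production design might want something stronger). A single small change inside a 1GB segment forces resending the whole chunk that contains it, so smaller chunks mean less data shipped for scattered writes, at the cost of more checksums to compute, store, and compare:

```python
import zlib

def chunk_crcs(data: bytes, chunk_size: int = 64 * 1024):
    """CRC-32 of each fixed-size chunk; chunk_size is the tunable knob."""
    return [zlib.crc32(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]

def bytes_to_resend(old: bytes, new: bytes, chunk_size: int = 64 * 1024):
    """Bytes that must be shipped if only differing chunks are sent."""
    old_crcs = chunk_crcs(old, chunk_size)
    new_crcs = chunk_crcs(new, chunk_size)
    total = 0
    for i, crc in enumerate(new_crcs):
        if i >= len(old_crcs) or crc != old_crcs[i]:
            total += len(new[i * chunk_size:(i + 1) * chunk_size])
    return total

# One modified byte in a 1 MiB file: with 64 kB chunks only 64 kB
# is resent; with 1 MiB chunks the entire file goes over the wire.
old = bytes(1024 * 1024)
new = bytearray(old)
new[500_000] = 1
small = bytes_to_resend(old, bytes(new), 64 * 1024)    # 65536
large = bytes_to_resend(old, bytes(new), 1024 * 1024)  # 1048576
```

A self-adjusting scheme could, in principle, watch the ratio of changed chunks per segment and grow or shrink the chunk size accordingly, which is one way to read the "sweet spot depends on how the database is used" remark.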