Re: Streaming a base backup from master

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Stephen Frost" <sfrost(at)snowman(dot)net>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Magnus Hagander" <magnus(at)hagander(dot)net>, "Dave Page" <dpage(at)pgadmin(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming a base backup from master
Date: 2010-09-03 15:02:06
Message-ID: 4C80C79E0200002500035171@gw.wicourts.gov
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> there's a heck of a lot of complexity there that we *don't* need.
> rsync is a great tool, don't get me wrong, but let's not try to go
> over our heads here.

Right -- among other things, rsync checks for portions of a new file
which match the old file at a different location. For example, if
you have a very large text file, and insert a line or two at the
start, it will wind up only sending the new lines. (Well, that and
all the checksums which help it determine that the rest of the file
matches at a shifted location.) I would think that PostgreSQL could
just check whether *corresponding* portions of a file matched, which
is much simpler.
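
To make the difference concrete, here is a minimal Python sketch of the
"corresponding portions" check. It is an illustration only, not PostgreSQL
or rsync code; the function name changed_blocks and the 8192-byte block
size are assumptions picked for the example.

```python
def changed_blocks(old: bytes, new: bytes, block_size: int = 8192) -> list[int]:
    """Return the indexes of blocks that differ at the same offset.

    Hypothetical helper: it only compares block N of the old data with
    block N of the new data, and never searches for shifted matches.
    """
    longest = max(len(old), len(new))
    changed = []
    for start in range(0, longest, block_size):
        # A block missing from one side compares as an empty slice,
        # so growth or truncation is reported as a change.
        if old[start:start + block_size] != new[start:start + block_size]:
            changed.append(start // block_size)
    return changed
```

With this check, a block overwritten in place is reported as a single
changed block, while inserting one byte at the start of the data makes
every block differ. The second case is the one rsync's shifted matching
is built for, and it is the case I'd expect us not to need.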

> we already break relations into 1G chunks (when/if they reach that
> size), so you won't necessarily be copying the entire relation if
> you're just doing mtime based or per-file-checksum based
> detection.

While 1GB granularity would be OK, I doubt it's optimal; I think CRC
checks for smaller chunks might be worthwhile. My gut feel is that
somewhere in the 64kB to 1MB range would probably be optimal for us,
although the "sweet spot" will depend on how the database is used.
A configurable or self-adjusting size would be cool.
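
A rough Python sketch of the per-chunk CRC idea, to show how the chunk
size changes what has to be sent. The helper names chunk_crcs and
bytes_to_resend are made up for this example, and zlib.crc32 is used
only because it is handy; nothing here reflects an actual or proposed
PostgreSQL implementation.

```python
import zlib


def chunk_crcs(data: bytes, chunk_size: int) -> list[int]:
    """CRC-32 of each fixed-size chunk of data (the last chunk may be short)."""
    return [zlib.crc32(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


def bytes_to_resend(old: bytes, new: bytes, chunk_size: int) -> int:
    """Count the bytes of new whose chunk CRC differs from the old chunk's.

    Hypothetical helper: chunks are compared only at corresponding
    offsets, and a chunk with no counterpart in old counts as changed.
    """
    old_crcs = chunk_crcs(old, chunk_size)
    total = 0
    for index, crc in enumerate(chunk_crcs(new, chunk_size)):
        if index >= len(old_crcs) or crc != old_crcs[index]:
            start = index * chunk_size
            total += len(new[start:start + chunk_size])
    return total
```

For a 1MB file with a single changed byte, 64kB chunks mean resending
64kB while a 1MB chunk means resending the whole file. The cost of going
smaller is more checksums to compute and exchange. Note also that a
matching CRC does not prove the chunks are identical, only that a
difference is unlikely.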

-Kevin
