Re: Synch Rep for CommitFest 2009-07

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Rick Gigger <rick(at)alpinenetworking(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Synch Rep for CommitFest 2009-07
Date: 2009-07-16 17:09:19
Message-ID: 407d949e0907161009s122c95b3ia567316ab50ff7be@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 16, 2009 at 4:41 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Rick Gigger wrote:
>> If you use an rsync like algorithm for doing the base backups wouldn't
>> that increase the size of the database for which it would still be
>> practical to just re-sync?  Couldn't you in fact sync a very large
>> database if the amount of actual change in the files was a small
>> percentage of the total size?
>
> It would certainly help to reduce the network traffic, though you'd
> still have to scan all the data to see what has changed.

The fundamental problem with pushing users to start over with a new
base backup is that there's no relationship between the size of the
WAL and the size of the database.

You can plausibly have a system with an extremely high transaction
rate generating WAL very quickly, but where the whole database fits in
a few hundred megabytes. In that case you could be only a few minutes
behind and it would still be faster to take a new base backup.

Or you could have a petabyte database which is rarely updated, in
which case it might be faster to apply weeks' worth of logs than to
try to take a base backup.
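
To make the comparison concrete, here is a rough sketch of the
arithmetic. Every name and number in it is an assumption made up for
illustration (the function, the throughput figures, the backlog
sizes); nothing here is an existing PostgreSQL setting or API. It also
ignores the WAL that accumulates while the base backup itself is
running.

```python
def faster_option(db_bytes, wal_backlog_bytes, copy_rate, replay_rate):
    """Guess which catch-up strategy finishes sooner.

    All inputs are hypothetical estimates a sysadmin would supply:
    sizes in bytes, rates in bytes per second.
    """
    base_backup_secs = db_bytes / copy_rate          # copy the whole database
    wal_replay_secs = wal_backlog_bytes / replay_rate  # replay the missed WAL
    if base_backup_secs < wal_replay_secs:
        return "base backup"
    return "WAL replay"


# Small, busy database: 300 MB of data, 6 GB of WAL behind (assumed).
print(faster_option(300e6, 6e9, copy_rate=50e6, replay_rate=20e6))

# Petabyte database, rarely updated: 50 GB of WAL behind (assumed).
print(faster_option(1e15, 50e9, copy_rate=100e6, replay_rate=20e6))
```

With these made-up figures the first case favours a new base backup
and the second favours replaying the logs. The point is that the
answer depends on both sizes and both rates, none of which can be
derived from the others.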

Only the sysadmin is actually going to know which makes more sense,
unless we start tying WAL parameters to the database size or
something like that.

--
greg
http://mit.edu/~gsstark/resume.pdf
