Re: Streaming a base backup from master

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dave Page <dpage(at)pgadmin(dot)org>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: Streaming a base backup from master
Date: 2010-09-05 15:51:38
Message-ID: 20100905155138.GA7583@svana.org
Lists: pgsql-hackers

On Sat, Sep 04, 2010 at 02:42:40PM +0100, Greg Stark wrote:
> On Fri, Sep 3, 2010 at 8:30 PM, Martijn van Oosterhout
> <kleptog(at)svana(dot)org> wrote:
> >
> > rsync is not rocket science. All you need is for the receiving end to
> > send a checksum for each block it has. The server side does the same
> > checksum and for each block sends back "same" or "new data".
>
> Well rsync is closer to rocket science than that. It does rolling
> checksums and can handle data being moved around, which vacuum does do
> so it's probably worthwhile.

Not sure. When vacuum moves rows around the chance that it will move
rows as a block and that the line pointers will be the same is
practically nil. I don't think rsync will pick up on blocks the size of
a typical row. Vacuum changes the headers so you never have a copied
block.
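
For reference, the per-block scheme I described earlier (the receiver
sends a checksum for each block it has, the server answers "same" or
"new data") could be sketched roughly as below. This is only an
illustration: the 8192-byte block size, the use of SHA-1, and the
function names are all assumptions made for the example, not anything
from an actual patch.

```python
import hashlib

BLOCK_SIZE = 8192  # assumption: matches PostgreSQL's default page size


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Receiver side: one checksum per fixed-size block it already holds."""
    return [
        hashlib.sha1(data[off:off + block_size]).digest()
        for off in range(0, len(data), block_size)
    ]


def diff_blocks(source: bytes, remote_sums: list[bytes],
                block_size: int = BLOCK_SIZE):
    """Server side: yield (index, None) for "same", (index, bytes) for "new data"."""
    for index, off in enumerate(range(0, len(source), block_size)):
        block = source[off:off + block_size]
        same = (index < len(remote_sums)
                and hashlib.sha1(block).digest() == remote_sums[index])
        yield index, (None if same else block)


def apply_diff(stale: bytes, replies, block_size: int = BLOCK_SIZE) -> bytes:
    """Receiver side: rebuild the file from its own blocks plus the new data."""
    out = bytearray()
    for index, block in replies:
        if block is None:
            # "same": reuse the block the receiver already has
            out += stale[index * block_size:(index + 1) * block_size]
        else:
            # "new data": take the block the server sent
            out += block
    return bytes(out)
```

Note that this compares blocks at fixed offsets only. There is no
rolling checksum, so data that has moved within the file is simply
resent, which is the trade-off being discussed here.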

> *However* I think you're all headed in the wrong direction here. I
> don't think rsync is what anyone should be doing with their backups at
> all. It still requires scanning through *all* your data even if you've
> only changed a small percentage (which it seems is the use case you're
> concerned about) and it results in corrupting your backup while the
> rsync is in progress and having a window with no usable backup. You
> could address that with rsync --compare-dest but then you're back to
> needing space and i/o for whole backups every time even if you're only
> changing small parts of the database.

If you're working from a known good version of the database at some
point, yes, you are right, you have more interesting options. If you
don't, you want something that will fix it.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
> - Charles de Gaulle
