Re: Streaming a base backup from master

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming a base backup from master
Date: 2010-09-03 11:28:51
Message-ID: AANLkTi=JvFQyzXRYxRb1n01HWCejwnda4RFWukDbezWM@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 3, 2010 at 13:19, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> It's been discussed before that it would be cool if you could stream a new
> base backup from the master server, via libpq. That way you would not need
> low-level filesystem access to initialize a new standby.
>
> Magnus mentioned today that he started hacking on that, and coincidentally I
> just started experimenting with it yesterday as well :-). So let's get this
> out on the mailing list.
>
> Here's a WIP patch. It adds a new "TAKE_BACKUP" command to the replication
> command set. Upon receiving that command, the master starts a COPY, and
> streams a tarred copy of the data directory to the client. The patch
> includes a simple command-line tool, pg_streambackup, to connect to a server
> and request a backup that you can then redirect to a .tar file or pipe to
> "tar x".
>
> TODO:
>
> * We need a smarter way to do pg_start/stop_backup() with this. At the
> moment, you can only have one backup running at a time, but we shouldn't
> have that limitation with this built-in mechanism.
>
> * The streamed backup archive should contain all the necessary WAL files
> too, so that you don't need to set up archiving to use this. You could just
> point the tiny client tool to the server, and get a backup archive
> containing everything that's necessary to restore correctly.

This last point should of course be *optional*, but it would be very
good to have that option (and probably have it on by default).

A couple of quick comments on things I noticed right away that differ
from the code I have :-) We chatted about some of this already, but it
should be included for others...

* It should be possible to pass the backup label through, not just
hardcode it to basebackup

* Needs support for tablespaces. We should either follow the symlinks
and pick up the files, or throw an error if any tablespaces exist.
Silently delivering an incomplete backup is not a good thing :-)

* Is there a point in adapting the chunk size to the size of the libpq buffers?
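
To make the tablespace point concrete, here is a rough sketch in Python
(not the patch's C code). It assumes the usual layout in which each
tablespace shows up as a symlink under pg_tblspc in the data directory;
the names find_tablespaces, plan_backup and IncompleteBackupError are
made up for illustration and exist in neither implementation.

```python
import os


class IncompleteBackupError(Exception):
    """Raised when tablespaces exist but the backup would not include them."""


def find_tablespaces(data_dir):
    """Return {link name: resolved target} for each symlink in pg_tblspc."""
    tblspc_dir = os.path.join(data_dir, "pg_tblspc")
    if not os.path.isdir(tblspc_dir):
        return {}
    found = {}
    for name in sorted(os.listdir(tblspc_dir)):
        path = os.path.join(tblspc_dir, name)
        # Assumption: tablespaces are plain symlinks (POSIX-style layout).
        if os.path.islink(path):
            found[name] = os.path.realpath(path)
    return found


def plan_backup(data_dir, follow_tablespaces):
    """Return the directories to archive, or fail loudly.

    Either the tablespace targets are included, or an error is raised.
    An archive that silently omits them is never produced.
    """
    tablespaces = find_tablespaces(data_dir)
    if tablespaces and not follow_tablespaces:
        names = ", ".join(tablespaces)
        raise IncompleteBackupError(
            "tablespaces present but not included: " + names
        )
    return [data_dir] + list(tablespaces.values())
```

Whichever of the two behaviours is chosen, the only outcome this rules
out is the third one: returning just the data directory while
tablespaces exist.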

FWIW, my implementation was as a user-defined function, which has the
advantage that it can run on 9.0. But most likely this code can be
ripped out and provided as a separate backport project for 9.0 if
necessary - no need to have separate codebases.

Other than that, our code is remarkably similar.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
