Re: Base Backup Streaming

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Josh Berkus <josh(at)postgresql(dot)org>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, greg(at)2ndQuadrant(dot)com, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Base Backup Streaming
Date: 2011-01-02 16:44:01
Message-ID: 4D20AB51.7020400@enterprisedb.com
Lists: pgsql-hackers

On 02.01.2011 14:47, Dimitri Fontaine wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>> BTW, there's a bunch of replication related stuff that we should work to
>> close, that are IMHO more important than synchronous replication. Like
>> making the standby follow timeline changes, to make failovers smoother, and
>> the facility to stream a base-backup over the wire. I wish someone worked on
>> those...
>
> So, we've been talking about base backup streaming at conferences and we
> have a working prototype.  We even have a needed piece of it in core
> now, that's the pg_read_binary_file() function.  What we still miss is
> an overall design and some integration effort.  Let's design first.

We even have a rudimentary patch to add the required backend support:

http://archives.postgresql.org/message-id/4C80D9B8.2020301@enterprisedb.com

That just needs to be polished into shape and documented.

> I propose the following new pg_ctl command to initiate the cloning:
>
>   pg_ctl clone [-D datadir] [-s on|off] [-t filename]  "primary_conninfo"
>
> As far as users are concerned, that would be the only novelty.  Once that
> command is finished (successfully) they would edit postgresql.conf and
> start the service as usual.  A basic recovery.conf file is created with
> the given options, standby_mode is driven by -s and defaults to off, and
> trigger_file defaults to being omitted and is given by -t.  Of course
> the primary_conninfo given on the command line is what ends up in the
> recovery.conf file.
>
> That alone would allow for making base backups for recovery purposes and
> for preparing a standby.

+1. Or maybe it would be better to make it a separate binary, rather than
part of pg_ctl.

> To support this new tool, the simplest would be to just copy what
> I've been doing in the prototype, that is run a query to get the primary
> file listing (per tablespace, not done in the prototype) then get their
> bytea content over the wire.  That means there's no further backend
> support code to write.

It would be so much nicer to have something more integrated, like the 
patch I linked above. Running queries requires connecting to a real 
database, which means that the user needs to have privileges to do that 
and you need to know the name of a valid database. Ideally this would 
all work through a replication connection. I think we should go with 
that from day one.
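
For comparison, the query-based approach from the quoted prototype could
look roughly like the sketch below. It is not the prototype's code and not
the patch linked above. It assumes the listing is built with pg_ls_dir()
and pg_stat_file() and the contents are fetched in chunks with
pg_read_binary_file(); run_query is a hypothetical hook that executes one
SQL statement over an ordinary database connection and returns a list of
row tuples. It ignores tablespaces, as the prototype does.

```python
from typing import Callable, Iterator, Sequence, Tuple

# Hypothetical hook: run_query(sql, params) -> list of row tuples.
RunQuery = Callable[[str, Sequence], list]

CHUNK_SIZE = 1024 * 1024  # bytes per pg_read_binary_file() call (arbitrary)


def list_files(run_query: RunQuery,
               directory: str = ".") -> Iterator[Tuple[str, int]]:
    """Yield (path, size) for every regular file below the data directory."""
    for (name,) in run_query("SELECT pg_ls_dir(%s)", (directory,)):
        path = name if directory == "." else directory + "/" + name
        (size, isdir), = run_query(
            "SELECT size, isdir FROM pg_stat_file(%s)", (path,))
        if isdir:
            # Recurse into subdirectories such as base/ or global/.
            yield from list_files(run_query, path)
        else:
            yield path, size


def fetch_file(run_query: RunQuery, path: str, size: int) -> bytes:
    """Fetch one file as bytea, CHUNK_SIZE bytes at a time."""
    chunks = []
    offset = 0
    while offset < size:
        (data,), = run_query(
            "SELECT pg_read_binary_file(%s, %s, %s)",
            (path, offset, CHUNK_SIZE))
        if not data:
            break  # file shrank on the primary since it was listed
        chunks.append(bytes(data))
        offset += len(data)
    return b"".join(chunks)
```

Every call here is a regular SQL query, so the client needs a database
name and a role allowed to run these functions, which is the drawback
described above.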

-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
