Re: pg_basebackup for streaming base backups

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup for streaming base backups
Date: 2011-01-15 21:14:19
Message-ID: AANLkTikWbn=UWg_u4gyyL04ooCo584dhE2ZiPjcjfbUi@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jan 15, 2011 at 21:16, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr> wrote:
> Hi,
>
> I have an unexpected 5 mins window to do a first reading of the patch,
> so here goes the quick doc and comments proof reading of it. :)

:-)

> Magnus Hagander <magnus(at)hagander(dot)net> writes:
>> This patch creates pg_basebackup in bin/, being a client program for
>> the streaming base backup feature.
>
> Great!  We have pg_ctl init[db], I think we want pg_ctl clone or some
> other command here to call the binary for us.  What do you think?

That might be useful, but I think we need to settle on the
pg_basebackup contents itself first.

Not sure pg_ctl clone would be the proper name, since it's not
actually a clone at this point (it might be with the second patch I
ust posted that includes the WAL files)

>> One thing I'm thinking about - right now the tool just takes -c
>> <conninfo> to connect to the database. Should it instead be taught to
>> take the connection parameters that for example pg_dump does - one for
>> each of host, port, user, password? (shouldn't be hard to do..)
>
> Consistency is good.
>
> Now, basic first patch reading level review:
>
> I think doc/src/sgml/backup.sgml should include some notes about how
> libpq base backup streaming compares to rsync and the like in term of
> efficiency or typical performances, when to prefer which, etc.  I'll see
> about doing some tests next week.

Yeah, the whole backup chapter may well need some more work after this.

> +      <term><option>--basedir=<replaceable class="parameter">directory</replaceable></option></term>
>
> That should be -D --pgdata, for consistency with pg_dump.

pg_dump doesn't have a -D. I assume you mean pg_ctl / initdb?

> On a quick reading it's unclear from the docs alone how -d and -t leave
> together.  It seems like the options are exclusive but I'd have to ask…

They are. The docs clearly say "Only one of <literal>-d</> and
<literal>-t</> can be specified"

> + * The file will be named base.tar[.gz] if it's for the main data directory
> + * or <tablespaceoid>.tar[.gz] if it's for another tablespace.
>
> Well we have UNIQUE, btree (spcname), so maybe we can use that here?

We could, but that would make it more likely to run into encoding
issues and such - do we restrict what can be in a tablespace name?

Also with a tar named by the oid, you *can* untar it into a directory
in pg_tblspc to recover from if you have to.

Another option, I think Heikki mentioned this on IM at some point, is
to do something like name it <oid>-<name>.tar. That would give us best
of both worlds?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-01-15 21:17:34 Re: LOCK for non-tables
Previous Message Magnus Hagander 2011-01-15 21:07:56 Include WAL in base backup