Re: Streaming base backups

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming base backups
Date: 2011-01-06 23:01:52
Message-ID: AANLkTikvmop2T=ukPutVTHAJQwEkQxRj-kq=utkKvYD6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 6, 2011 at 23:57, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> On 05.01.2011 15:54, Magnus Hagander wrote:
>>
>> Attached is an updated streaming base backup patch, based off the work
>> that Heikki started.
>> ...
>> I've implemented a frontend for this in pg_streamrecv, based on the
>> assumption
>> that we wanted to include this in bin/ for 9.1 - and that it seems like a
>> reasonable place to put it. This can obviously be moved elsewhere if we
>> want to.
>
> Hmm, is there any point in keeping the two functionalities in the same
> binary, taking the base backup and streaming WAL to an archive directory?
> Looks like the only common option between the two modes is passing the
> connection string, and the verbose flag. A separate pg_basebackup binary
> would probably make more sense.

Yeah, once I broke things apart for better readability, I started
leaning in that direction as well.

However, if you consider the things that Dimiti mentioned about
streaming at the same time as downloading, having them in the same one
would make more sense. I don't think that's something for now,
though..

>> That code needs a lot more cleanup, but I wanted to make sure I got the
>> backend
>> patch out for review quickly. You can find the current WIP branch for
>> pg_streamrecv on my github page at
>> https://github.com/mhagander/pg_streamrecv,
>> in the branch "baserecv". I'll be posting that as a separate patch once
>> it's
>> been a bit more cleaned up (it does work now if you want to test it,
>> though).
>
> Looks like pg_streamrecv creates the pg_xlog and pg_tblspc directories,
> because they're not included in the streamed tar. Wouldn't it be better to
> include them in the tar as empty directories at the server-side? Otherwise
> if you write the tar file to disk and untar it later, you have to manually
> create them.

Yeah, good point. Originally, the tar code (your tar code, btw :P)
didn't create *any* directories, so I stuck it in there. I agree it
should be moved to the backend patch now.

> It would be nice to have an option in pg_streamrecv to specify the backup
> label to use.

Agreed.

> An option to stream the tar to stdout instead of a file would be very handy
> too, so that you could pipe it directly to gzip for example. I realize you
> get multiple tar files if tablespaces are used, but even if you just throw
> an error in that case, it would be handy.

Makes sense.

>> * Suggestion from Heikki: perhaps at some point we're going to need a full
>>   bison grammar for walsender commands.
>
> Maybe we should at least start using the lexer; we're not quite there to
> need a full-blown grammar yet, but even a lexer might help.

Might. I don't speak flex very well, so I'm not really sure what that
would mean.

> BTW, looking at the WAL-streaming side of pg_streamrecv, if you start it
> from scratch with an empty target directory, it needs to connect to
> "postgres" database, to run pg_current_xlog_location(), and then reconnect
> in replication mode. That's a bit awkward, there might not be a "postgres"
> database, and even if there is, you might not have the permission to connect
> to it. It would be much better to have a variant of the START_REPLICATION
> command at the server-side that begins streaming from the current location.
> Maybe just by leaving out the start-location parameter.

Agreed. That part is unchanged from the one that runs against 9.0
though, where that wasn't a possibility. But adding something like
that to the walsender in 9.1 would be good.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-01-07 00:31:20 Re: Fixing GIN for empty/null/full-scan cases
Previous Message Heikki Linnakangas 2011-01-06 22:57:13 Re: Streaming base backups