Re: Using streaming replication as log archiving

From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using streaming replication as log archiving
Date: 2010-09-30 14:39:23
Message-ID: AANLkTinkPUR_9jeyMTWiOk-RG8osGKKShMX3ceN8CS9-@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 30, 2010 at 10:24 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:

>> That would allow some nice options.  I've been thinking what would
>> be the ideal use of this with our backup scheme, and the best I've
>> thought up would be that each WAL segment file would be a single
>> output stream, with the option of calling an executable (which could
>> be a script) with the target file name and then piping the stream to
>> it.  At 16MB or a forced xlog switch, it would close the stream and
>> call the executable again with a new file name.  You could have a
>> default executable for the default behavior, or just build in a
>> default if no executable is specified.
>
> The problem with that one (which I'm sure is solvable somehow) is how
> to deal with restarts. Both restarts in the middle of a segment
> (happens all the time if you don't have an archive_timeout set), and
> really also restarts between segments. How would the tool know where
> to begin streaming again? Right now, it looks at the files - but doing
> it by your suggestion there are no files to look at. We'd need a
> second script/command to call to figure out where to restart from in
> that case, no?

And then think of the future, when sync rep is in... I'm hoping to be
able to use something like this to do synchronous replication to my
archive (instead of to a live server).

> It should be safe to just rsync the archive directory as it's being
> written by pg_streamrecv. Doesn't that give you the property you're
> looking for - local machine gets data streamed in live, remote machine
> gets it rsynced every minute?

When the "being written to" segment completes and moves to the final
location, he'll get an extra whole "copy" of the file. But if the
"move" can be an exec of his script, the compressed/gzipped final
result shouldn't be that bad. Certainly no worse than what he's
currently getting with archive_command ;-) And he's got the
uncompressed incremental updates as they are happening.
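
To make the "exec his script on the move" idea a bit more concrete,
here is a rough sketch in Python of what such a completion hook might
look like. This is not something pg_streamrecv does today; the function
name, the ".gz" and ".tmp" suffixes, and the directory layout are all
made up for illustration.

```python
import gzip
import os
import shutil


def archive_completed_segment(partial_path, archive_dir):
    """Hypothetical hook: gzip a finished WAL segment into archive_dir.

    partial_path is the uncompressed segment that was being streamed to;
    archive_dir is the final archive location. Returns the path of the
    compressed file. A real archiver would also fsync before removing
    the source; that is left out to keep the sketch short.
    """
    final_path = os.path.join(
        archive_dir, os.path.basename(partial_path) + ".gz"
    )
    tmp_path = final_path + ".tmp"

    # Compress under a temporary name first, so anything copying the
    # archive directory never picks up a half-written .gz file.
    with open(partial_path, "rb") as src, gzip.open(tmp_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

    # Rename into place (atomic on POSIX when both names are on the
    # same filesystem), then drop the uncompressed copy.
    os.replace(tmp_path, final_path)
    os.remove(partial_path)
    return final_path
```

The temporary-name-then-rename step is there for the rsync case: the
remote side only ever sees either the uncompressed in-progress segment
or a complete compressed one.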

a.
