From: | Magnus Hagander <magnus(at)hagander(dot)net> |
---|---|
To: | Aidan Van Dyk <aidan(at)highrise(dot)ca> |
Cc: | Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Using streaming replication as log archiving |
Date: | 2010-09-30 15:00:58 |
Message-ID: | AANLkTimot6G7mOAvML=ss0u5++t=07_2E2va7aTdDs9F@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Sep 30, 2010 at 16:39, Aidan Van Dyk <aidan(at)highrise(dot)ca> wrote:
> On Thu, Sep 30, 2010 at 10:24 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>
>>> That would allow some nice options. I've been thinking what would
>>> be the ideal use of this with our backup scheme, and the best I've
>>> thought up would be that each WAL segment file would be a single
>>> output stream, with the option of calling an executable (which could
>>> be a script) with the target file name and then piping the stream to
>>> it. At 16MB or a forced xlog switch, it would close the stream and
>>> call the executable again with a new file name. You could have a
>>> default executable for the default behavior, or just build in a
>>> default if no executable is specified.
>>
>> The problem with that one (which I'm sure is solvable somehow) is how
>> to deal with restarts. Both restarts in the middle of a segment
>> (happens all the time if you don't have an archive_timeout set), and
>> really also restarts between segments. How would the tool know where
>> to begin streaming again? Right now, it looks at the files - but doing
>> it by your suggestion there are no files to look at. We'd need a
>> second script/command to call to figure out where to restart from in
>> that case, no?
>
> And then think of the future, when sync rep is in... I'm hoping to be
> able to use something like this to do synchronous replication to my
> archive (instead of to a live server).
Right, that could be a future enhancement. Doesn't mean we shouldn't
still do our best with the async mode of course :P
>> It should be safe to just rsync the archive directory as it's being
>> written by pg_streamrecv. Doesn't that give you the property you're
>> looking for - local machine gets data streamed in live, remote machine
>> gets it rsynced every minute?
>
> When the "being written to" segment completes and moves to the final
> location, he'll get an extra whole "copy" of the file. But if the
Ah, good point.
> "move" can be an exec of his script, the compressed/gzipped final
> result shouldn't be that bad. Certainly no worse than what he's
> currently getting with archive command ;-) And he's got the
> uncompressed incremental updates as they are happening.
Yeah, it would be trivial to replace the rename() call with a call to
a script that gets to do whatever is suitable to the file. Actually,
it'd probably be better to rename() it *and* call the script, so that
we can continue properly if the script fails.
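
As a rough illustration of the "rename() it *and* call the script" idea, here is a minimal Python sketch. It is not how pg_streamrecv is implemented; the function name finalize_segment, the ".partial" suffix and the hook command are all hypothetical, chosen only for the example. The point it shows is the ordering: the rename happens first, so a failing script leaves the completed segment in its final location and streaming can carry on.

```python
import os
import subprocess
import sys


def finalize_segment(partial_path, final_path, hook_cmd=None):
    """Rename a completed segment into place, then run an optional hook.

    Hypothetical helper for illustration only. Returns True if the hook
    succeeded (or no hook was given), False if the hook failed.
    """
    # Rename first, so the segment reaches its final name even if the
    # hook later fails.
    os.rename(partial_path, final_path)

    if not hook_cmd:
        return True

    try:
        # The hook receives the final file name as its last argument.
        result = subprocess.run(list(hook_cmd) + [final_path], check=False)
    except OSError as exc:
        # The hook could not be started at all (missing, not executable).
        print(f"hook could not be started: {exc}", file=sys.stderr)
        return False

    if result.returncode != 0:
        # Report the failure but do not undo the rename.
        print(f"hook exited with status {result.returncode}", file=sys.stderr)
        return False
    return True
```

A caller would log a False return and keep receiving the next segment; whether and when to retry the script is a separate policy decision that this sketch leaves open.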
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/