Re: Support for pg_receivexlog --format=plain|tar

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for pg_receivexlog --format=plain|tar
Date: 2016-12-27 09:34:42
Message-ID: CABUevEzEvuG6FgSPm0r8smQVW1rSo1HssgSnkeALAhEv8BCL7w@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 27, 2016 at 2:23 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

> Hi all,
>
> Since 56c7d8d4, pg_basebackup supports tar format when streaming WAL
> records. This has been done by introducing a new transparent routine
> layer (walmethods.c) to control the method used to fetch WAL: plain or
> tar.
>
> pg_receivexlog does not make use of that yet, but I think that it
> could, to allow retention of more WAL history within the same amount of
> disk space. OK, disk space is cheap, but for some users things like
> that matter when defining a retention policy, especially when
> things are automated around Postgres. I really think that
> pg_receivexlog should be able to support an option like
> --format=plain|tar. "plain" is the default, and matches the current
> behavior. This option is of course designed to match pg_basebackup's
> one.
>
> So, here is in detail what would happen with --format=tar:
> - When streaming begins, write changes to a tar stream, named
> segnum.tar.partial as long as the segment is not completed.
> - Once the segment completes, rename it to segnum.tar.
> - each individual segment has its own tarball.
> - if pg_receivexlog fails to receive changes in the middle of a
> segment, it restarts streaming from the beginning of that segment,
> considering that the current .partial segment is corrupted. So if
> the server comes back online, empty the current .partial file and begin
> writing to it again. (I have found a bug on HEAD in this area
> actually).
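
A minimal sketch of the per-segment lifecycle described in the list above, in Python rather than pg_receivexlog's C. The function name write_segment_tar and its arguments are hypothetical, and the segment is buffered in memory only because a tar header needs the member size up front; this is an illustration under those assumptions, not the proposed patch.

```python
import io
import os
import tarfile


def write_segment_tar(directory, segname, chunks):
    """Write one WAL segment into its own tarball.

    The tarball is named <segname>.tar.partial while it is being written
    and renamed to <segname>.tar once the segment is complete.
    """
    partial = os.path.join(directory, segname + ".tar.partial")
    final = os.path.join(directory, segname + ".tar")

    # Assumption: the whole segment fits in memory (16MB by default).
    data = b"".join(chunks)

    # Mode "w" truncates a leftover .partial file from an interrupted run,
    # which matches "empty the current .partial file and begin again".
    with tarfile.open(partial, "w") as tar:
        info = tarfile.TarInfo(name=segname)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

    # Only a fully written tarball ever gets the final name.
    os.replace(partial, final)
    return final
```

Because the rename happens last, anything still carrying the .partial suffix can be treated as incomplete on restart.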

> Magnus, you have mentioned to me as well that you had a couple of ideas
> on the matter, feel free to jump in and let's mix our thoughts!
>

Yeah, I've been wondering what the actual usecase is here :)

Though I was considering the case where all segments are streamed into the
same tarfile (and then some sort of configurable limit where we'd switch
tarfile after <n> segments, which rapidly started to feel too complicated).

What's the actual advantage of having it wrapped inside a single tarfile?

> There are a couple of things that I have been considering as well for
> pg_receivexlog. Though they are not directly tied to this thread,
> here they are so that I don't forget about them:
> - Removal of oldest WAL segments on a partition. When writing WAL
> segments to a dedicated partition, we could have an option that
> automatically removes the oldest WAL segment if the partition is full.
> This triggers once a segment is completed.
> - Compression of fully-written segments. When a segment is finished
> being written, pg_receivexlog could compress it further with gzip, for
> example. With --format=t this leads to segnum.tar.gz being generated.
> The advantage of doing those two things in pg_receivexlog is
> monitoring. One process to handle them all, and there is no need of
> cron jobs to handle any cleanup or compression.
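
The "remove the oldest segment when the partition is full" idea could look roughly like the sketch below. Everything here is an assumption for illustration: the function name prune_oldest_segments, the min_free_bytes threshold, and the injectable free_bytes callable are not from any patch. It relies on WAL segment file names sorting oldest-first within a timeline.

```python
import os
import shutil


def prune_oldest_segments(directory, min_free_bytes, free_bytes=None):
    """Remove completed segments, oldest first, until enough space is free.

    Returns the list of removed file names. Files still being written
    (.partial suffix) are never touched.
    """
    if free_bytes is None:
        # Default: ask the filesystem holding the segment directory.
        free_bytes = lambda: shutil.disk_usage(directory).free

    # Lexicographic order of segment names is chronological order
    # within a single timeline.
    segments = sorted(
        name for name in os.listdir(directory)
        if not name.endswith(".partial")
    )

    removed = []
    while segments and free_bytes() < min_free_bytes:
        oldest = segments.pop(0)
        os.remove(os.path.join(directory, oldest))
        removed.append(oldest)
    return removed
```

Calling this once per completed segment would match the "triggers once a segment is completed" behavior described above.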
>

I was at one point thinking that would be a good idea as well, but recently
I've been thinking more that what we should do is implement a
"--post-segment-command", which would act similarly to archive_command but
be started by pg_receivexlog. This could handle things like compression, and
also integration with external backup tools like backrest or barman in a
cleaner way. We could also spawn this without waiting for it to finish
immediately, which would allow parallelization of the process. When doing
the compression inline, that rapidly becomes the bottleneck. Unlike a
basebackup, you're only dealing with the need to buffer 16MB on disk before
compressing it, so it should be fairly cheap.
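
To make the idea concrete, here is a rough sketch of spawning such a command without waiting for it. The helper name run_post_segment_command is hypothetical, and the %p placeholder is borrowed from archive_command as an assumption about what a "--post-segment-command" might accept; no such option exists yet.

```python
import shlex
import subprocess


def run_post_segment_command(template, segment_path):
    """Start the post-segment command for a completed segment.

    Returns the Popen handle right away, so streaming can continue while
    the command (compression, hand-off to a backup tool, ...) runs.
    """
    # Split first, substitute afterwards, so a path containing spaces
    # still ends up as a single argument.
    args = [arg.replace("%p", segment_path) for arg in shlex.split(template)]
    return subprocess.Popen(args)
```

For example, run_post_segment_command("gzip %p", path) would compress the finished segment in the background; the caller can keep the returned handles and reap them later, which is what allows several segments to be processed in parallel.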

Another thing I've been considering in the same area would be to add the
ability to write the segments to a pipe instead of a directory. Then you
could just pipe it into gzip without the need to buffer on disk. This would
kill the ability to know at which point we'd sync()ed to disk, but in most
cases so will doing direct gzip. It just means we couldn't support this in
sync mode.

I can see the point of being able to compress the individual segments
directly in pg_receivexlog in smaller systems though, without the need to
rely on an external compression program as well. But in that case, is there
any reason we need to wrap it in a tarfile, and can't just write it to
<segment>.gz natively?
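
As a sketch of what writing <segment>.gz natively amounts to, with no tar wrapper involved (Python's gzip module stands in for zlib here; the function name write_segment_gz and the .gz.partial suffix are assumptions for illustration):

```python
import gzip
import os


def write_segment_gz(directory, segname, chunks):
    """Stream one WAL segment straight into <segname>.gz."""
    partial = os.path.join(directory, segname + ".gz.partial")
    final = os.path.join(directory, segname + ".gz")

    # Chunks are compressed as they arrive; nothing is buffered
    # uncompressed on disk first.
    with gzip.open(partial, "wb") as out:
        for chunk in chunks:
            out.write(chunk)

    # Rename only once the compressed stream has been closed cleanly.
    os.replace(partial, final)
    return final
```

The resulting file can be restored with plain gunzip, without a tar extraction step.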

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
