Re: Support for pg_receivexlog --format=plain|tar

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for pg_receivexlog --format=plain|tar
Date: 2016-12-27 12:16:22
Message-ID: CAB7nPqTPa6D2-T-tFf7ffpnqooN5ytd8b80Mspic7fBSCiUSOA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 27, 2016 at 6:34 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Tue, Dec 27, 2016 at 2:23 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
> wrote:
>> Magnus, you have mentioned me as well that you had a couple of ideas
>> on the matter, feel free to jump in and let's mix our thoughts!
>
>
> Yeah, I've been wondering what the actual use case is here :)

There is value in compressing segments that finish with trailing
zeros, even if they are not the most common kind of segment in a WAL
archive.
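
To put a rough number on it, here is a standalone sketch (plain zlib,
not pg_receivexlog code; the 1MB of non-zero data is an arbitrary
assumption) showing how much such a segment shrinks:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zlib.h>               /* build with -lz */

#define SEG_SIZE (16 * 1024 * 1024)     /* standard WAL segment size */

int
main(void)
{
    unsigned char *seg = malloc(SEG_SIZE);
    uLongf      comp_len = compressBound(SEG_SIZE);
    unsigned char *comp = malloc(comp_len);
    long        i;

    /*
     * Pretend only the first 1MB was written; the rest is the zero
     * padding of a pre-allocated segment.
     */
    for (i = 0; i < 1024 * 1024; i++)
        seg[i] = random() & 0xFF;
    memset(seg + 1024 * 1024, 0, SEG_SIZE - 1024 * 1024);

    if (compress2(comp, &comp_len, seg, SEG_SIZE, 9) != Z_OK)
        return 1;
    /* the random 1MB stays ~1MB, the 15MB zero tail almost vanishes */
    printf("16MB segment compressed to %lu bytes\n",
           (unsigned long) comp_len);
    return 0;
}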

> Though I was considering the case where all segments are streamed into the
> same tarfile (and then some sort of configurable limit where we'd switch
> tarfile after <n> segments, which rapidly started to feel too complicated).
>
> What's the actual advantage of having it wrapped inside a single tarfile?

I am advocating for one tar file per segment, to be honest. Grouping
them makes the failure handling more complicated when the connection
to the server is killed or the replication stream is cut. Well, not
impossible actually, but I think that you would need to drop a status
file into the segment folder with enough information to let
pg_receivexlog know from where in the tar file it needs to continue
writing. If a new tarball is created for each segment, deciding from
where to stream after a connection failure is just a matter of doing
what is done today: looking at the completed segments and beginning
to stream from the incomplete or absent one.
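
To make that concrete, here is a minimal sketch of the decision
(standalone C, not the real FindStreamingStart() logic; it assumes
the .partial suffix convention used for plain segments carries over
to per-segment tarballs):

#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan the target directory, skip completed segments, and report any
 * .partial file, which is where streaming must resume.  If nothing
 * is partial, streaming restarts after the newest completed segment
 * (not shown here).
 */
static void
scan_segments(const char *dir)
{
    DIR        *d = opendir(dir);
    struct dirent *de;

    if (d == NULL)
        return;
    while ((de = readdir(d)) != NULL)
    {
        size_t      len = strlen(de->d_name);

        if (len > 8 && strcmp(de->d_name + len - 8, ".partial") == 0)
            printf("resume from %s\n", de->d_name);
    }
    closedir(d);
}

int
main(int argc, char **argv)
{
    scan_segments(argc > 1 ? argv[1] : ".");
    return 0;
}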

>> There are a couple of things that I have been considering as well for
>> pg_receivexlog. Though they do not stick directly to this thread,
>> here they are so I don't forget about them:
>> - Removal of the oldest WAL segments on a partition. When writing WAL
>> segments to a dedicated partition, we could have an option that
>> automatically removes the oldest WAL segment if the partition is full,
>> triggering once a segment is completed.
>> - Compression of fully-written segments. When a segment is finished
>> being written, pg_receivexlog could compress it further, with gzip for
>> example. With --format=t this leads to segnum.tar.gz being generated.
>> The advantage of doing those two things in pg_receivexlog is
>> monitoring: one process to handle them all, with no need for cron
>> jobs to do any cleanup or compression.
>
> I was at one point thinking that would be a good idea as well, but recently
> I've more been thinking that what we should do is implement a
> "--post-segment-command", which would act similarly to archive_command but
> be started by pg_receivexlog. This could handle things like compression, and
> also integration with external backup tools like backrest or barman in a
> cleaner way. We could also spawn this without waiting for it to finish
> immediately, which would allow parallelization of the process. When doing
> the compression inline, that rapidly becomes the bottleneck. Unlike a
> basebackup you're only dealing with the need to buffer 16MB on disk before
> compressing it, so it should be fairly cheap.

I did not consider the case of barman and backrest, to be honest;
having the view of the 2ndQuadrant folks and David would be nice
here. Still, the main idea behind doing those things in
pg_receivexlog's own process would be to avoid spawning a new
process. I have a class of users who care about things that could
hang; they play a lot with network-mounted disks... And VMs, of
course.
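
For what it's worth, firing such a (still hypothetical)
--post-segment-command without waiting could be as simple as a fork
without a waitpid; a sketch, where the WAL_SEGMENT environment
variable is an invented convention for passing the segment path:

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Run the user's post-segment command for a finished segment without
 * waiting for it, so compression or archiving can proceed in
 * parallel with streaming the next segment.
 */
static void
fire_post_segment_command(const char *command, const char *segpath)
{
    pid_t       pid = fork();

    if (pid == 0)
    {
        /* child: expose the segment path, then run the command */
        setenv("WAL_SEGMENT", segpath, 1);
        execl("/bin/sh", "sh", "-c", command, (char *) NULL);
        _exit(127);             /* exec failed */
    }
    /* parent returns immediately and keeps streaming */
}

int
main(void)
{
    /* we never wait for the children, so let the system reap them */
    signal(SIGCHLD, SIG_IGN);
    fire_post_segment_command("gzip \"$WAL_SEGMENT\"",
                              "000000010000000000000001");
    return 0;
}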

> Another thing I've been considering in the same area would be to add the
> ability to write the segments to a pipe instead of a directory. Then you
> could just pipe it into gzip without the need to buffer on disk. This would
> kill the ability to know at which point we'd sync()ed to disk, but in most
> cases so will doing direct gzip. Just means we couldn't support this in sync
> mode.

Users piping their data don't care about reliability anyway, so that
is not a problem.

> I can see the point of being able to compress the individual segments
> directly in pg_receivexlog in smaller systems though, without the need to
> rely on an external compression program as well. But in that case, is there
> any reason we need to wrap it in a tarfile, and can't just write it to
> <segment>.gz natively?

You mean having a --compress=0|9 option that creates individual gz
files for each segment? Definitely we could just do that. It would be
a shame, though, not to use the WAL methods you have introduced in
src/bin/pg_basebackup, which already handle the whole set of tar and
tar.gz output. A quick hack in pg_receivexlog has shown me that
segments are saved in a single tarball, which is not cool. My feeling
is that reusing the existing infrastructure, but making it pluggable
for individual files (in short, I think that what is needed here is a
way to tell the WAL method to switch to a new file when a segment
completes), would really be the simplest approach in terms of code
lines and maintenance. A rough sketch of the idea is below.
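
Something along these lines (a hypothetical simplification, not the
actual walmethods.h interface; the callback names are invented):

#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical, simplified view of a per-segment-aware WAL method.
 * The missing piece is the segment boundary: with an explicit
 * finish_segment() callback, the tar method could close
 * segnum.tar(.gz) and open a fresh file for the next segment instead
 * of appending everything to a single tarball.
 */
typedef struct WalMethodOps
{
    /* create a new output file (plain, tar, or tar.gz) for a segment */
    void       *(*open_segment) (const char *segname);
    /* append streamed WAL data to the current segment's file */
    ssize_t     (*write) (void *f, const void *buf, size_t count);
    /* flush, close, and rename the file into place once complete */
    int         (*finish_segment) (void *f);
} WalMethodOps;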
--
Michael
