Re: pg_basebackup: Allow use of arbitrary compression program

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Michael Harris <harmic(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup: Allow use of arbitrary compression program
Date: 2017-04-09 19:33:08
Message-ID: CABUevEwzCGY+rZo3rm89F9V5ED+g0WtEfCP0R=VVUK_gUf3iPw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 7, 2017 at 4:04 AM, Michael Harris <harmic(at)gmail(dot)com> wrote:

> Hello,
>
> Back in pg 9.2, we hacked a copy of pg_basebackup to add a command
> line option which would allow the user to specify an arbitrary
> external program (potentially including arguments) to be used to
> compress the tar backup.
>
> Our motivation was to be able to use pigz (parallel gzip
> implementation) to speed up the compression. It also allows using
> tools like bzip2, xz, etc instead of the inbuilt zlib.
>
> I never ended up submitting that upstream, but now it looks like I
> will have to repeat the exercise for 9.6, so I was wondering if such a
> feature would be welcomed.
>
> I found one or two references to people asking for this, eg:
> https://www.commandprompt.com/blog/a_pg_basebackup_wish_list/
>
> To do it properly would require:
>
> 1) Adding command line option as follows:
>
> -C, --compressprog=PROG
> Use supplied program for compression
>
> 2) The current logic either uses zlib if compiled in, or offers no
> compression at all, controlled by a series of #ifdef/#endif. I would
> prefer that the user can either use zlib or an external program
> without having to recompile, so I would remove the #ifdefs and replace
> them with run time branching.
>

I'm not sure how that would work, or why it would be needed. The
reasonable behavior would be: if zlib is available at build time, the
choices are "no compression", "zlib compression" or "external
compression". If zlib is not available at build time, the choices are
"no compression" or "external compression".

Or maybe I'm misunderstanding what you're saying?
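
To make the distinction concrete, here is a minimal sketch of how the set
of choices could depend on the build while the selection itself happens at
run time. The enum, the function names and the option semantics are
hypothetical and are not taken from pg_basebackup; HAVE_LIBZ is assumed to
be the macro that configure defines when zlib is found.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; not the actual pg_basebackup code. */
typedef enum
{
	COMPRESS_NONE,
	COMPRESS_ZLIB,
	COMPRESS_EXTERNAL
} CompressMethod;

/* Only the zlib choice depends on what was available at build time. */
static bool
compress_method_available(CompressMethod method)
{
	switch (method)
	{
		case COMPRESS_NONE:
		case COMPRESS_EXTERNAL:
			return true;
		case COMPRESS_ZLIB:
#ifdef HAVE_LIBZ
			return true;
#else
			return false;
#endif
	}
	return false;
}

/*
 * Pick a method from the parsed options. Assumed semantics: a non-empty
 * compressprog selects the external program, zlib_level > 0 selects zlib,
 * and giving both is treated as an error. Returns false if the options
 * conflict or the chosen method is not available in this build.
 */
static bool
choose_compress_method(const char *compressprog, int zlib_level,
					   CompressMethod *result)
{
	if (compressprog != NULL && compressprog[0] != '\0')
	{
		if (zlib_level > 0)
			return false;		/* conflicting options */
		*result = COMPRESS_EXTERNAL;
	}
	else if (zlib_level > 0)
		*result = COMPRESS_ZLIB;
	else
		*result = COMPRESS_NONE;

	return compress_method_available(*result);
}
```

With this shape, a build without zlib rejects only the zlib request, and
the external program remains selectable without recompiling.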

> 3) When opening the output file, if the -C option was used, use popen
> to open a child process and write to that.
>
> My questions are:
> - Has anything like this already been discussed?
>

I think it has, but not in detail.

> - Would this be a welcome contribution?
>

Yes, I definitely think this would be useful.

> - Can anyone see any problems with the above approach?
>

One thing to consider is the recent work to ensure that the output is
properly synced to disk once it has been written. I don't think it's
reasonable to expect that from an external compression program, but if
syncing can be made optional in that case, that'd be good. At the very
least, be careful not to break the existing sync behavior.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
