pg_basebackup: Allow use of arbitrary compression program

From: Michael Harris <harmic(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: pg_basebackup: Allow use of arbitrary compression program
Date: 2017-04-07 02:04:29
Message-ID: CADofcAX2=f=hW7-E_sVWObXRN80t0rvMeDr43WbZBrGx-v6y2w@mail.gmail.com
Lists: pgsql-hackers

Hello,

Back in PostgreSQL 9.2, we hacked a copy of pg_basebackup to add a
command-line option that allows the user to specify an arbitrary
external program (potentially including arguments) to be used to
compress the tar backup.

Our motivation was to be able to use pigz (a parallel gzip
implementation) to speed up the compression. It also allows using
tools like bzip2, xz, etc. instead of the built-in zlib.

I never ended up submitting that upstream, but now it looks like I
will have to repeat the exercise for 9.6, so I was wondering if such a
feature would be welcomed.

I found one or two references to people asking for this, e.g.:
https://www.commandprompt.com/blog/a_pg_basebackup_wish_list/

To do it properly would require:

1) Adding a command-line option as follows:

-C, --compressprog=PROG
Use supplied program for compression

2) The current logic either uses zlib if compiled in, or offers no
compression at all, controlled by a series of #ifdef/#endif. I would
prefer that the user be able to use either zlib or an external program
without having to recompile, so I would remove the #ifdefs and replace
them with run-time branching.

3) When opening the output file, if the -C option was used, use popen()
to start a child process and write to its standard input.
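
To make points 2 and 3 concrete, here is a minimal sketch of the run-time
branch, assuming a POSIX system where popen() is available. The names
BackupSink, sink_open, sink_write and sink_close are hypothetical and are
not taken from pg_basebackup; the path quoting is deliberately naive and
the zlib branch is left out for brevity.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for popen()/pclose() */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical output sink: a plain file, or a pipe to a compressor. */
typedef struct
{
    FILE *fp;
    int   is_pipe;              /* 1 if fp was opened with popen() */
} BackupSink;

/*
 * Open the tar output. If compressprog is non-NULL, run
 * "<compressprog> > <path>" through the shell and write to its stdin;
 * otherwise open the file directly. Returns 0 on success, -1 on failure.
 */
static int
sink_open(BackupSink *sink, const char *path, const char *compressprog)
{
    assert(sink != NULL && path != NULL);

    if (compressprog != NULL)
    {
        char cmd[4096];
        int  n;

        /* Naive quoting: breaks if path contains '"' or shell metacharacters. */
        n = snprintf(cmd, sizeof(cmd), "%s > \"%s\"", compressprog, path);
        if (n < 0 || (size_t) n >= sizeof(cmd))
            return -1;
        sink->fp = popen(cmd, "w");
        sink->is_pipe = 1;
    }
    else
    {
        sink->fp = fopen(path, "wb");
        sink->is_pipe = 0;
    }
    return sink->fp != NULL ? 0 : -1;
}

/* Write len bytes to the sink. Returns 0 on success, -1 on a short write. */
static int
sink_write(BackupSink *sink, const void *buf, size_t len)
{
    return fwrite(buf, 1, len, sink->fp) == len ? 0 : -1;
}

/*
 * Close the sink. For a pipe, pclose() waits for the child and returns its
 * wait status, so a failing compressor is reported as an error here.
 */
static int
sink_close(BackupSink *sink)
{
    int rc = sink->is_pipe ? pclose(sink->fp) : fclose(sink->fp);

    sink->fp = NULL;
    return rc == 0 ? 0 : -1;
}
```

With this shape, the value given to -C (for example "pigz -p 4") would be
passed as compressprog, and the rest of the tar-writing code would call
sink_write() without caring which branch was taken. Note that errors from
the compressor may only become visible at sink_close() time.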

My questions are:
- Has anything like this already been discussed?
- Would this be a welcome contribution?
- Can anyone see any problems with the above approach?

Thanks!

Regards
Mike Harris
