Re: pg_basebackup: Allow use of arbitrary compression program

From: Michael Harris <harmic(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup: Allow use of arbitrary compression program
Date: 2017-04-27 08:08:48
Message-ID: CADofcAX_bvvVc_f+RVsv=us_pVJy6aU9au3gremALxWs71L9YQ@mail.gmail.com
Lists: pgsql-hackers

Hi All,

I have a working prototype now, but there is one aspect I haven't been
able to find the best solution for.

The CLI so far has the following new option:

-C, --compressprog=PRG use supplied external program for compression

An example usage would be:

pg_basebackup -D /home/harmic/tmp/ -C bzip2 -F t

The command string supplied to -C should be a compression command that
reads from stdin and outputs to stdout.
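
A minimal sketch of how such a command could be driven, assuming a POSIX
system with popen() and a shell. The helper names (open_compress_pipe,
write_compress_pipe, close_compress_pipe) are hypothetical and are not
taken from the prototype; the output redirection is done by the shell, and
neither string is shell-quoted here, which real code would need to handle.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L     /* for popen() and pclose() */
#endif
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: run "prog > outpath" through the shell and return
 * a stream whose writes feed the program's stdin. Returns NULL on failure.
 * Assumption: prog and outpath contain nothing that needs shell quoting.
 */
static FILE *
open_compress_pipe(const char *prog, const char *outpath)
{
    char        cmd[1024];
    int         n;

    n = snprintf(cmd, sizeof(cmd), "%s > %s", prog, outpath);
    if (n < 0 || (size_t) n >= sizeof(cmd))
        return NULL;            /* command line would be truncated */
    return popen(cmd, "w");
}

/* Send one buffer to the program; returns 0 on success, -1 on short write. */
static int
write_compress_pipe(FILE *pipe, const void *buf, size_t len)
{
    return fwrite(buf, 1, len, pipe) == len ? 0 : -1;
}

/* Close the pipe; returns 0 only if the program exited with status 0. */
static int
close_compress_pipe(FILE *pipe)
{
    return pclose(pipe) == 0 ? 0 : -1;
}
```

With this shape, the tar stream would be written chunk by chunk with
write_compress_pipe(), and a non-zero exit status from the compression
program would surface when the pipe is closed.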

The problem is: when constructing the output filename(s), how can we
give them the correct suffix (.gz / .bz2 / .xz / ...)?

The options I can think of are:

1. Add yet another command line option to specify a suffix
2. Some kind of heuristic to figure it out from the supplied command
string (from known compression programs, but that will never be
complete)
3. Don't worry about it, let the user rename them afterwards, in
which case they would be named xxxx.tar
4. Make the compression command a template, e.g. "bzip2 -c > %s.bz2",
so that the template itself adds the suffix
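
To make option 2 concrete, here is a rough sketch of what such a heuristic
might look like. The function name guess_suffix and the table of programs
are assumptions made for illustration only; as noted above, any such table
will be incomplete, and an unknown program simply gets no suffix.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative and deliberately incomplete list of known programs. */
static const struct
{
    const char *prog;
    const char *suffix;
}           known_progs[] = {
    {"gzip", ".gz"},
    {"pigz", ".gz"},
    {"bzip2", ".bz2"},
    {"pbzip2", ".bz2"},
    {"xz", ".xz"},
    {"lz4", ".lz4"},
    {"zstd", ".zst"},
};

/*
 * Hypothetical helper: guess the filename suffix from the first word of
 * the command string, ignoring any leading directory and any arguments.
 * Returns "" when the program is not in the table.
 */
static const char *
guess_suffix(const char *cmd)
{
    const char *base;
    const char *p;
    size_t      len;
    size_t      i;

    cmd += strspn(cmd, " \t");      /* skip leading whitespace */
    len = strcspn(cmd, " \t");      /* length of the first word */

    /* Strip the directory part, e.g. "/usr/bin/xz" -> "xz". */
    base = cmd;
    for (p = cmd; p < cmd + len; p++)
        if (*p == '/')
            base = p + 1;
    len -= (size_t) (base - cmd);

    for (i = 0; i < sizeof(known_progs) / sizeof(known_progs[0]); i++)
        if (strlen(known_progs[i].prog) == len &&
            strncmp(base, known_progs[i].prog, len) == 0)
            return known_progs[i].suffix;

    return "";
}
```

For the example above, guess_suffix("bzip2") would give ".bz2", while a
wrapper script or an unlisted tool would leave the file named xxxx.tar.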

#4 might also be more flexible for tools that don't support output to
stdout, but it is a bit more complex to use.
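
A sketch of how the template in #4 might be expanded, under the assumption
that every "%s" stands for the base output filename and that "%%" yields a
literal percent sign. The escape rule and the name expand_template are
assumptions for illustration; the prototype does not define either.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy tmpl into out, replacing each "%s" with name
 * and each "%%" with a single '%'. Returns 0 on success, or -1 if the
 * result (including the terminating NUL) does not fit in outsz bytes.
 */
static int
expand_template(const char *tmpl, const char *name, char *out, size_t outsz)
{
    size_t      used = 0;
    size_t      namelen = strlen(name);

    if (outsz == 0)
        return -1;

    while (*tmpl != '\0')
    {
        if (tmpl[0] == '%' && tmpl[1] == 's')
        {
            /* Need room for the name plus the final NUL. */
            if (namelen >= outsz - used)
                return -1;
            memcpy(out + used, name, namelen);
            used += namelen;
            tmpl += 2;
        }
        else
        {
            if (tmpl[0] == '%' && tmpl[1] == '%')
                tmpl++;         /* "%%" collapses to one '%' */
            /* Need room for this character plus the final NUL. */
            if (1 >= outsz - used)
                return -1;
            out[used++] = *tmpl++;
        }
    }

    out[used] = '\0';
    return 0;
}
```

Expanding "bzip2 -c > %s.bz2" with a base name of "/tmp/base.tar" would
then produce the shell command "bzip2 -c > /tmp/base.tar.bz2". The
filename is not shell-quoted in this sketch.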

Any other ideas?

Regards // Mike

On Wed, Apr 12, 2017 at 3:49 PM, Michael Harris <harmic(at)gmail(dot)com> wrote:
> Hi,
>
> Thanks for the feedback!
>
>>> 2) The current logic either uses zlib if compiled in, or offers no
>>> compression at all, controlled by a series of #ifdef/#endif. I would
>>> prefer that the user can either use zlib or an external program
>>> without having to recompile, so I would remove the #ifdefs and replace
>>> them with run time branching.
>>
>>
>> Not sure how that would work or be needed. The reasonable thing would be if zlib
>> is available when building the choices would be "no compression",
>> "zlib compression" or "external compression". If there was no zlib available
>> when building, the choices would be "no compression" or "external compression".
>
> That's exactly how I intend it to work. I had thought that the current
> structure of the code would not allow that, but looking at it more
> closely I see that it does, so I don't have to re-organize the
> #ifdefs.
>
> Regards // Mike
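
The split described in the quoted exchange could be sketched roughly as
follows: zlib availability is decided at build time, while the choice among
the available methods is made at run time. The enum and function names are
hypothetical; HAVE_LIBZ is assumed here to be the configure symbol that is
defined when PostgreSQL is built with zlib.

```c
#include <stdbool.h>

/* Hypothetical set of compression choices offered to the user. */
typedef enum
{
    COMPRESS_NONE,
    COMPRESS_ZLIB,
    COMPRESS_EXTERNAL
} CompressMethod;

/*
 * Report whether a method can be used in this build. Only the zlib case
 * depends on a compile-time symbol; the other two are always available.
 */
static bool
compress_method_available(CompressMethod method)
{
    switch (method)
    {
        case COMPRESS_NONE:
        case COMPRESS_EXTERNAL:
            return true;
        case COMPRESS_ZLIB:
#ifdef HAVE_LIBZ
            return true;
#else
            return false;
#endif
    }
    return false;               /* unknown value */
}
```

Option parsing could call this check and report an error when zlib
compression is requested in a build without zlib, leaving the existing
#ifdef blocks where they are.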
