Re: Refactoring of compression options in pg_basebackup

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Georgios Kokolatos <gkokolatos(at)pm(dot)me>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>
Subject: Re: Refactoring of compression options in pg_basebackup
Date: 2022-01-07 06:43:06
Message-ID: Ydfg+r7SNZ+HSJst@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 06, 2022 at 09:27:19AM -0500, Robert Haas wrote:
> Did you mean that -z would be a synonym for --compression-method=gzip?
> It doesn't really make sense for -Z to be that, unless it's also
> setting the compression level.

Yes, I meant "-z", not "-Z", to be a synonym of
--compression-method=gzip. Sorry for the typo.

> My objection to --compress=$LEVEL is that the compression level seems
> like it ought to rightfully be subordinate to the choice of algorithm.
> In general, there's no reason why a compression algorithm has to offer
> a choice of compression levels at all, or why they have to be numbered
> 0 through 9. For example, lz4 on my system claims to offer compression
> levels from 1 through 12, plus a separate set of "fast" compression
> levels starting with 1 and going up to an unspecified number. And then
> it also has options to favor decompression speed, change the block
> size, and a few other parameters. We don't necessarily want to expose
> all of those options, but we should structure things so that we could
> if it became important. The way to do that is to make the compression
> algorithm the primary setting, and then anything else you can set for
> that compressor is somehow a subordinate setting.

For any compression method, that maps to an integer, so.. But I am
not going to fight hard on that.

> Put another way, we don't decide first that we want to compress with
> level 7, and then afterward decide whether that's gzip, lz4, or bzip2.
> We pick the compressor first, and then MAYBE think about changing the
> compression level.

Which is why things should be checked once all the options are
processed. I'd recommend that you read the option patch a bit more,
that may help.

> Well what I was looking at was this:
>
> - printf(_(" -Z, --compress=0-9 compress tar output with given
> compression level\n"));
> + printf(_(" -Z, --compress=1-9 compress tar output with given
> compression level\n"));
> + printf(_(" --compression-method=METHOD\n"
> + " method to compress data\n"));
>
> That seems to show that, post-patch, the argument to -Z would be a
> compression level, even if --compression-method were something other
> than gzip.

Yes, after the patch --compress would be a compression level. And, if
attempting to use with --compression-method set to "none", or
potentially "lz4", it would just fail. If not using this "gzip", the
compression level is switched to Z_DEFAULT_COMPRESSION. That's this
area of the patch, FWIW:
+ /*
+ * Compression-related options.
+ */
+ switch (compression_method)

> It's possible that I haven't read something carefully enough, but to
> me, what I said seems to be a straightforward conclusion based on
> looking at the usage help in the patch. So if I came to the wrong
> conclusion, perhaps that usage help isn't reflecting the situation you
> intend to create, or not as clearly as it ought.

Perhaps the --help output could be clearer, then. Do you have a
suggestion?

Bringing walmethods.c at the same page for the directory and the tar
methods was my primary goal here, and the tests are a bonus, so I've
applied this part for now, leaving pg_basebackup alone until we figure
out the layer of options we should use. Perhaps it would be better to
revisit that stuff once the server-side compression has landed.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-01-07 06:49:47 Re: Add jsonlog log_destination for JSON server logs
Previous Message Amit Kapila 2022-01-07 06:35:41 Re: row filtering for logical replication