Re: refactoring basebackup.c (zstd workers)

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Jeevan Ladhe <jeevanladhe(dot)os(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Shinoda, Noriyoshi (PN Japan FSIP)" <noriyoshi(dot)shinoda(at)hpe(dot)com>, Abhijit Menon-Sen <ams(at)toroid(dot)org>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>
Subject: Re: refactoring basebackup.c (zstd workers)
Date: 2022-03-17 19:41:30
Message-ID: 20220317194130.GM28503@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 9178c779ba..00c593f1af 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2731,14 +2731,24 @@ The commands accepted in replication mode are:
+ <para>
+ For <literal>gzip</literal> the compression level should be an

gzip comma

+++ b/src/backend/replication/basebackup.c
@@ -18,6 +18,7 @@

#include "access/xlog_internal.h" /* for pg_start/stop_backup */
#include "common/file_perm.h"
+#include "common/backup_compression.h"

alphabetical

- errmsg("unrecognized compression algorithm: \"%s\"",
+ errmsg("unrecognized compression algorithm \"%s\"",

Most other places seem to say "compression method". So I'd suggest to change
that here, and in doc/src/sgml/ref/pg_basebackup.sgml.

- if (o_compression_level && !o_compression)
+ if (o_compression_detail && !o_compression)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("compression level requires compression")));

s/level/detail/

/*
+ * Basic parsing of a value specified for -Z/--compress.
+ *
+ * We're not concerned here with understanding exactly what behavior the
+ * user wants, but we do need to know whether the user is requesting client
+ * or server side compression or leaving it unspecified, and we need to
+ * separate the name of the compression algorithm from the detail string.
+ *
+ * For instance, if the user writes --compress client-lz4:6, we want to
+ * separate that into (a) client-side compression, (b) algorithm "lz4",
+ * and (c) detail "6". Note, however, that all the client/server prefix is
+ * optional, and so is the detail. The algorithm name is required, unless
+ * the whole string is an integer, in which case we assume "gzip" as the
+ * algorithm and use the integer as the detail.
..
*/
static void
+parse_compress_options(char *option, char **algorithm, char **detail,
+ CompressionLocation *locationres)

It'd be great if this were re-usable for wal_compression, which I hope in pg16 will
support at least level=N. And eventually pg_dump. But those clients shouldn't
accept a client/server prefix. Maybe the way to handle that is for those tools
to check locationres and reject it if it was specified.

+ * We're not concerned with validation at this stage, so if the user writes
+ * --compress client-turkey:sandwhich, the requested algorithm is "turkey"
+ * and the detail string is "sandwhich". We'll sort out whether that's legal

sp: sandwich

+ WalCompressionMethod wal_compress_method;

This is confusingly similar to src/include/access/xlog.h:WalCompression.
I think someone else mentioned this before ?

+ * A compression specification specifies the parameters that should be used
+ * when * performing compression with a specific algorithm. The simplest

star

+/*
+ * Get the human-readable name corresponding to a particular compression
+ * algorithm.
+ */
+char *
+get_bc_algorithm_name(bc_algorithm algorithm)

should be const ?

+ /* As a special case, the specification can be a bare integer. */
+ bare_level = strtol(specification, &bare_level_endp, 10);

Should this call expect_integer_value()?
See below.

+ result->parse_error =
+ pstrdup("found empty string where a compression option was expected");

Needs to be localized with _() ?
Also, document that it's pstrdup'd.

+/*
+ * Parse 'value' as an integer and return the result.
+ *
+ * If parsing fails, set result->parse_error to an appropriate message
+ * and return -1.
+ */
+static int
+expect_integer_value(char *keyword, char *value, bc_specification *result)

-1 isn't great, since it's also an integer, and, also a valid compression level
for zstd (did you see my message about that?). Maybe INT_MIN is ok.

+{
+ int ivalue;
+ char *ivalue_endp;
+
+ ivalue = strtol(value, &ivalue_endp, 10);

Should this also set/check errno ?
And check if value != ivalue_endp ?
See strtol(3)

+char *
+validate_bc_specification(bc_specification *spec)
...
+ /*
+ * If a compression level was specified, check that the algorithm expects
+ * a compression level and that the level is within the legal range for
+ * the algorithm.

It would be nice if this could be shared with wal_compression and pg_dump.
We shouldn't need multiple places with structures giving the algorithms and
range of compression levels.

+ unsigned options; /* OR of BACKUP_COMPRESSION_OPTION constants */

Should be "unsigned int" or "bits32" ?

The server crashes if I send an unknown option - you should hit that in the
regression tests.

$ src/bin/pg_basebackup/pg_basebackup --wal-method fetch -Ft -D - -h /tmp --no-sync --no-manifest --compress=server-lz4:a |wc -c
TRAP: FailedAssertion("pointer != NULL", File: "../../../../src/include/utils/memutils.h", Line: 123, PID: 8627)
postgres: walsender pryzbyj [local] BASE_BACKUP(ExceptionalCondition+0xa0)[0x560b45d7b64b]
postgres: walsender pryzbyj [local] BASE_BACKUP(pfree+0x5d)[0x560b45dad1ea]
postgres: walsender pryzbyj [local] BASE_BACKUP(parse_bc_specification+0x154)[0x560b45dc5d4f]
postgres: walsender pryzbyj [local] BASE_BACKUP(+0x43d56c)[0x560b45bc556c]
postgres: walsender pryzbyj [local] BASE_BACKUP(SendBaseBackup+0x2d)[0x560b45bc85ca]
postgres: walsender pryzbyj [local] BASE_BACKUP(exec_replication_command+0x3a2)[0x560b45bdddb2]
postgres: walsender pryzbyj [local] BASE_BACKUP(PostgresMain+0x6b2)[0x560b45c39131]
postgres: walsender pryzbyj [local] BASE_BACKUP(+0x40530e)[0x560b45b8d30e]
postgres: walsender pryzbyj [local] BASE_BACKUP(+0x408572)[0x560b45b90572]
postgres: walsender pryzbyj [local] BASE_BACKUP(+0x4087b9)[0x560b45b907b9]
postgres: walsender pryzbyj [local] BASE_BACKUP(PostmasterMain+0x1135)[0x560b45b91d9b]
postgres: walsender pryzbyj [local] BASE_BACKUP(main+0x229)[0x560b45ad0f78]

This is interpreted like client-gzip-1; should multiple specifications of
compress be prohibited ?

| src/bin/pg_basebackup/pg_basebackup --wal-method fetch -Ft -D - -h /tmp --no-sync --no-manifest --compress=server-lz4 --compress=1

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2022-03-17 19:46:34 Re: Proposal for internal Numeric to Uint64 conversion function.
Previous Message Peter Geoghegan 2022-03-17 19:34:50 Re: ICU for global collation