Re: zstd compression for pg_dump

From: Jacob Champion <jchampion(at)timescale(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, gkokolatos(at)pm(dot)me, Michael Paquier <michael(at)paquier(dot)xyz>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Subject: Re: zstd compression for pg_dump
Date: 2023-03-10 20:48:13
Message-ID: 3d04afca-7d9d-c90f-5fef-5cf4fdb2173f@timescale.com
Lists: pgsql-hackers

Hi,

This'll need another rebase over the meson ICU changes.

On Wed, Mar 8, 2023 at 10:59 AM Jacob Champion <jchampion(at)timescale(dot)com>
wrote:
> I did some smoke testing against zstd's GitHub release on Windows. To
> build against it, I had to construct an import library, and put that
> and the DLL into the `lib` folder expected by the MSVC scripts...
> which makes me wonder if I've chosen a harder way than necessary?

A meson wrap made this much easier! It looks like pg_dump's meson.build
is missing dependencies on zstd (meson couldn't find the headers in the
subproject without them).

> Parallel zstd dumps seem to work as expected, in that the resulting
> pg_restore output is identical to uncompressed dumps and nothing
> explodes. I haven't inspected the threading implementation for safety
> yet, as you mentioned.

Hm. Best I can tell, the CloneArchive() machinery is supposed to be
handling safety for this, by isolating each thread's state. I don't feel
comfortable pronouncing this new addition safe or not, because I'm not
sure I understand what the comments in the format-specific _Clone()
callbacks are saying yet.

> And I still wasn't able to test :workers, since
> it looks like the official libzstd for Windows isn't built for
> multithreading. That'll be another day's project.

The wrapped installation enabled threading too, so I was able to try
with :workers=8. Everything seems to work, but I didn't have a dataset
that showed speed improvements at the time. It did seem to affect the
overall compressibility negatively -- which makes sense, I think,
assuming each thread is looking at a separate and/or smaller window.
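For reference: threading has to be requested through zstd's advanced
API, and a libzstd built without ZSTD_MULTITHREAD rejects the request
outright -- which would explain the official Windows binaries doing
nothing for :workers. A minimal sketch of that, not the patch's code;
the helper name is made up:

    #include <zstd.h>

    /* Sketch: create a compression stream with nworkers worker threads. */
    static ZSTD_CStream *
    create_cstream_with_workers(int level, int nworkers)
    {
        ZSTD_CStream *cstream = ZSTD_createCStream();

        if (cstream == NULL)
            return NULL;

        /* ZSTD_CStream is a ZSTD_CCtx, so the advanced API applies */
        if (ZSTD_isError(ZSTD_CCtx_setParameter(cstream, ZSTD_c_compressionLevel, level)) ||
            ZSTD_isError(ZSTD_CCtx_setParameter(cstream, ZSTD_c_nbWorkers, nworkers)))
        {
            /* e.g. a single-threaded libzstd rejects nbWorkers != 0 */
            ZSTD_freeCStream(cstream);
            return NULL;
        }

        return cstream;
    }

With nbWorkers set, the input gets split into independently compressed
jobs, which would also account for the ratio hit above.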

On to code (not a complete review):

> if (hasSuffix(fname, ".gz"))
>     compression_spec.algorithm = PG_COMPRESSION_GZIP;
> else
> {
>     bool        exists;
>
>     exists = (stat(path, &st) == 0);
>     /* avoid unused warning if it is not built with compression */
>     if (exists)
>         compression_spec.algorithm = PG_COMPRESSION_NONE;
> -#ifdef HAVE_LIBZ
> -    if (!exists)
> -    {
> -        free_keep_errno(fname);
> -        fname = psprintf("%s.gz", path);
> -        exists = (stat(fname, &st) == 0);
> -
> -        if (exists)
> -            compression_spec.algorithm = PG_COMPRESSION_GZIP;
> -    }
> -#endif
> -#ifdef USE_LZ4
> -    if (!exists)
> -    {
> -        free_keep_errno(fname);
> -        fname = psprintf("%s.lz4", path);
> -        exists = (stat(fname, &st) == 0);
> -
> -        if (exists)
> -            compression_spec.algorithm = PG_COMPRESSION_LZ4;
> -    }
> -#endif
> +    else if (check_compressed_file(path, &fname, "gz"))
> +        compression_spec.algorithm = PG_COMPRESSION_GZIP;
> +    else if (check_compressed_file(path, &fname, "lz4"))
> +        compression_spec.algorithm = PG_COMPRESSION_LZ4;
> +    else if (check_compressed_file(path, &fname, "zst"))
> +        compression_spec.algorithm = PG_COMPRESSION_ZSTD;
> }

This function lost some coherence, I think. Should there be a hasSuffix
check at the top for ".zst" (and, for that matter, ".lz4"), matching the
suffixes that check_compressed_file() probes for? And the comment about
avoiding an unused warning no longer makes sense, since that was only
possible with the #ifdef blocks that were removed.
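
Concretely, I'd have expected the top of the function to read something
like this (sketch only, reusing the patch's names; untested):

    if (hasSuffix(fname, ".gz"))
        compression_spec.algorithm = PG_COMPRESSION_GZIP;
    else if (hasSuffix(fname, ".lz4"))
        compression_spec.algorithm = PG_COMPRESSION_LZ4;
    else if (hasSuffix(fname, ".zst"))
        compression_spec.algorithm = PG_COMPRESSION_ZSTD;
    else
    {
        /* ... fall back to the stat()/check_compressed_file() probing ... */
    }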

I'm a little suspicious of the replacement of supports_compression()
with parse_compress_specification(). For example:

> -    errmsg = supports_compression(AH->compression_spec);
> -    if (errmsg)
> +    parse_compress_specification(AH->compression_spec.algorithm,
> +                                 NULL, &compress_spec);
> +    if (compress_spec.parse_error != NULL)
>     {
>         pg_log_warning("archive is compressed, but this installation does not support compression (%s)",
> -                       errmsg);
> -        pg_free(errmsg);
> +                       compress_spec.parse_error);
> +        pg_free(compress_spec.parse_error);
>     }

The top-level error here is "does not support compression", but wouldn't
a bad specification option with a supported compression method trip this
path too?

> +static void
> +ZSTD_CCtx_setParam_or_die(ZSTD_CStream *cstream,
> +                          ZSTD_cParameter param, int value, char *paramname)

IMO we should avoid stepping on the ZSTD_ namespace with our own
internal function names.
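
I.e. keep the helper, just rename it. Presumably the body stays the
same; something like this (sketch, body guessed from the name):

    static void
    Zstd_setParam_or_die(ZSTD_CStream *cstream,
                         ZSTD_cParameter param, int value, char *paramname)
    {
        size_t      res;

        res = ZSTD_CCtx_setParameter(cstream, param, value);
        if (ZSTD_isError(res))
            pg_fatal("could not set compression parameter %s: %s",
                     paramname, ZSTD_getErrorName(res));
    }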

> +    if (cs->readF != NULL)
> +    {
> +        zstdcs->dstream = ZSTD_createDStream();
> +        if (zstdcs->dstream == NULL)
> +            pg_fatal("could not initialize compression library");
> +
> +        zstdcs->input.size = ZSTD_DStreamInSize();
> +        zstdcs->input.src = pg_malloc(zstdcs->input.size);
> +
> +        zstdcs->output.size = ZSTD_DStreamOutSize();
> +        zstdcs->output.dst = pg_malloc(zstdcs->output.size + 1);
> +    }
> +
> +    if (cs->writeF != NULL)
> +    {
> +        zstdcs->cstream = ZstdCStreamParams(cs->compression_spec);
> +
> +        zstdcs->output.size = ZSTD_CStreamOutSize();
> +        zstdcs->output.dst = pg_malloc(zstdcs->output.size);
> +        zstdcs->output.pos = 0;
> +    }

This seems to suggest that both cs->readF and cs->writeF could be set,
but in that case the output buffer gets allocated twice, and the first
allocation is leaked.
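
If the two modes really are meant to be mutually exclusive, an
assertion would document that and keep the leak from ever happening
(sketch):

    /* a handle either compresses or decompresses, never both */
    Assert((cs->readF == NULL) != (cs->writeF == NULL));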

I was curious about the extra byte allocated in the decompression case.
I see that ReadDataFromArchiveZstd() is null-terminating the buffer
before handing it to ahwrite(), but why does it need to do that?

> +static const char *
> +Zstd_get_error(CompressFileHandle *CFH)
> +{
> +    return strerror(errno);
> +}

Seems like this should be using the zstderror stored in the handle?
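Something like this instead, assuming private_data holds the patch's
zstd state struct (names here are from memory, so take with salt):

    static const char *
    Zstd_get_error(CompressFileHandle *CFH)
    {
        ZstdCompressorState *zstdcs = (ZstdCompressorState *) CFH->private_data;

        /* zstderror holds the last ZSTD_* return code seen by read/write */
        return ZSTD_getErrorName(zstdcs->zstderror);
    }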

In ReadDataFromArchiveZstd():

> +    size_t      input_allocated_size = ZSTD_DStreamInSize();
> +    size_t      res;
> +
> +    for (;;)
> +    {
> +        size_t      cnt;
> +
> +        /*
> +         * Read compressed data. Note that readF can resize the buffer; the
> +         * new size is tracked and used for future loops.
> +         */
> +        input->size = input_allocated_size;
> +        cnt = cs->readF(AH, (char **) unconstify(void **, &input->src), &input->size);
> +        input_allocated_size = input->size;
> +        input->size = cnt;

This is pretty complex for what it's doing. I'm a little worried that we
let the reallocated buffer escape to the caller while losing track of
how big it is. I think that works today, since there's only ever one
call per handle, but any future refactoring that allowed cs->readData()
to be called more than once would subtly break this code.
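
One way out would be to keep the allocated size in the per-handle state
instead of a local, so it survives across calls. Sketch only -- the
extra field is invented:

    typedef struct ZstdCompressorState
    {
        /* ... existing fields ... */
        ZSTD_inBuffer input;
        size_t      input_allocated_size;   /* capacity of input.src,
                                             * tracked across calls */
    } ZstdCompressorState;

    /* then, inside the read loop: */
    input->size = zstdcs->input_allocated_size;
    cnt = cs->readF(AH, (char **) unconstify(void **, &input->src), &input->size);
    zstdcs->input_allocated_size = input->size;
    input->size = cnt;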

In ZstdWriteCommon():

> +    /*
> +     * Extra paranoia: avoid zero-length chunks, since a zero length chunk
> +     * is the EOF marker in the custom format. This should never happen
> +     * but...
> +     */
> +    if (output->pos > 0)
> +        cs->writeF(AH, output->dst, output->pos);
> +
> +    output->pos = 0;

Elsewhere, output->pos is set to zero before compressing, but here we do
it after, which I think leads to subtle differences in the function
preconditions. If that's an intentional difference, can the reason be
called out in a comment?
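
For comparison, resetting at the top would make "the output buffer is
empty on entry" the invariant in both paths. A sketch of the shape I
mean, not the patch's actual loop:

    output->pos = 0;

    while (input->pos < input->size)
    {
        size_t      res;

        res = ZSTD_compressStream2(zstdcs->cstream, output, input, ZSTD_e_continue);
        if (ZSTD_isError(res))
            pg_fatal("could not compress data: %s", ZSTD_getErrorName(res));

        if (output->pos > 0)
        {
            cs->writeF(AH, output->dst, output->pos);
            output->pos = 0;
        }
    }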

--Jacob
