Re: Add LZ4 compression in pg_dump

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>, gkokolatos(at)pm(dot)me
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, shiy(dot)fnst(at)fujitsu(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com>
Subject: Re: Add LZ4 compression in pg_dump
Date: 2023-02-26 14:59:41
Message-ID: 8ffd14fe-0278-7f8d-78d4-472816d06268@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2/25/23 06:02, Justin Pryzby wrote:
> I have some fixes (attached) and questions while polishing the patch for
> zstd compression. The fixes are small and could be integrated with the
> patch for zstd, but could be applied independently.
>
> - I'm unclear about get_error_func(). That's called in three places
> from pg_backup_directory.c, after failures from write_func(), to
> supply an compression-specific error message to pg_fatal(). But it's
> not being used outside of directory format, nor for errors for other
> function pointers, or even for all errors in write_func(). Is there
> some reason why each compression method's write_func() shouldn't call
> pg_fatal() directly, with its compression-specific message ?
>

I think there are a couple more places that might/should call
get_error_func(). For example ahwrite() in pg_backup_archiver.c now
simply does

if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;

but perhaps it should call get_error_func() too. There are probably
other places that call write_func() and should use get_error_func().

> - I still think supports_compression() should be renamed, or made into a
> static function in the necessary file. The main reason is that it's
> more clear what it indicates - whether compression is "implemented by
> pgdump" and not whether compression is "supported by this postgres
> build". It also seems possible that we'd want to add a function
> called something like supports_compression(), indicating whether the
> algorithm is supported by the current build. It'd be better if pgdump
> didn't subjugate that name.
>

If we choose to rename this to have pgdump_ prefix, fine with me. But I
don't think there's a realistic chance of conflict, as it's restricted
to pgdump header etc. And it's not part of an API, so I guess we could
rename that in the future if needed.

> - Finally, the "Nothing to do in the default case" comment comes from
> Michael's commit 5e73a6048:
>
> + /*
> + * Custom and directory formats are compressed by default with gzip when
> + * available, not the others.
> + */
> + if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
> + !user_compression_defined)
> {
> #ifdef HAVE_LIBZ
> - if (archiveFormat == archCustom || archiveFormat == archDirectory)
> - compressLevel = Z_DEFAULT_COMPRESSION;
> - else
> + parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
> + &compression_spec);
> +#else
> + /* Nothing to do in the default case */
> #endif
> - compressLevel = 0;
> }
>
>
> As the comment says: for -Fc and -Fd, the compression is set to zlib, if
> enabled, and when not otherwise specified by the user.
>
> Before 5e73a6048, this set compressLevel=0 for -Fp and -Ft, *and* when
> zlib was unavailable.
>
> But I'm not sure why there's now an empty "#else". I also don't know
> what "the default case" refers to.
>
> Maybe the best thing here is to move the preprocessor #if, since it's no
> longer in the middle of a runtime conditional:
>
> #ifdef HAVE_LIBZ
> + if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
> + !user_compression_defined)
> + parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
> + &compression_spec);
> #endif
>
> ...but that elicits a warning about "variable set but not used"...
>

Not sure, I need to think about this a bit.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2023-02-26 15:07:21 Re: Why the lp_len is 28 not 32?
Previous Message jacktby@gmail.com 2023-02-26 14:35:07 Why the lp_len is 28 not 32?