Re: zstd compression for pg_dump

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Daniil Zakhlystov <usernamedt(at)yandex-team(dot)ru>
Subject: Re: zstd compression for pg_dump
Date: 2021-01-04 06:04:57
Message-ID: E3868F55-750B-407A-8C15-6C790B5D4D77@yandex-team.ru
Lists: pgsql-hackers

> On 4 Jan 2021, at 07:53, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> Note, there's currently several "compression" patches in CF app. This patch
> seems to be independent of the others, but probably shouldn't be totally
> uncoordinated (like adding lz4 in one and zstd in another might be poor
> execution).
>
> https://commitfest.postgresql.org/31/2897/
> - Faster pglz compression
> https://commitfest.postgresql.org/31/2813/
> - custom compression methods for toast
> https://commitfest.postgresql.org/31/2773/
> - libpq compression

I think that's a downside of our development system: patch authors do not want to create dependencies on other patches.
I'd say that both lz4 and zstd should be supported in TOAST, FPIs, libpq, and pg_dump. As for pglz, I think we should not proliferate it any further.
Lz4 and Zstd actually represent different tradeoffs. Basically, lz4 is so CPU-cheap that one should use it whenever writing to disk or a network interface. Zstd represents an actual bandwidth/CPU tradeoff.
Also, none of the patchsets touches an important possibility: a preexisting dictionary could radically improve compression of small data (even in pglz).

Some minor notes on the patchset in this thread.

Libpq compression encountered some problems with memory consumption, which required some extra configuration effort. Did you measure memory usage for this patchset?

[PATCH 03/20] Support multiple compression algs/levels/opts..
abtracts -> abstracts
enum CompressionAlgorithm actually represents the very same thing as in "Custom compression methods"

Daniil, is the level definition compatible with the libpq compression patch?
+typedef struct Compress {
+    CompressionAlgorithm alg;
+    int level;
+    /* Is a nondefault level set ? This is useful since different compression
+     * methods have different "default" levels. For now we assume the levels
+     * are all integer, though.
+     */
+    bool level_set;
+} Compress;

[PATCH 04/20] struct compressLibs
I think this directive would be correct.
+// #ifdef HAVE_LIBZ?

Here's an extra comment:
// && errno == ENOENT)

[PATCH 06/20] pg_dump: zstd compression

I'd propose to build with Zstd by default. It seems other patches do it this way. Though there are possible downsides.

Thanks for working on this! We will have very IO-efficient Postgres :)

Best regards, Andrey Borodin.
