Re: wal_compression=zstd

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: wal_compression=zstd
Date: 2022-03-04 07:19:32
Message-ID: YiG9hOS/vivHvJny@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 22, 2022 at 05:19:48PM -0600, Justin Pryzby wrote:
> I am not claiming that zstd is generally better for WAL. Rather, if we're
> going to support alternate compression methods, it's nice to give a couple
> options (in addition to pglz). Some use cases would certainly suffer from
> slower WAL. But providing the option will benefit some others. Note that a
> superuser can set wal_compression for a given session - this would probably
> benefit bulk-loading in an otherwise OLTP environment.

Well, I cannot say which one is better either as it depends on your
workload, mostly, but we know for sure that both have benefits, so I
don't mind adding it now. What you are proposing is built on top of
the existing code, making it a very simple change.

> As writen, this patch uses zstd level=1 (whereas the ZSTD's default compress
> level is 6).

Why? ZSTD using this default has its reasons, no? And it would be
consistent to do the same for ZSTD as for the other two methods.

- The supported methods are <literal>pglz</literal> and
- <literal>lz4</literal> (if <productname>PostgreSQL</productname> was
- compiled with <option>--with-lz4</option>). The default value is
- <literal>off</literal>. Only superusers can change this setting.
+ The supported methods are <literal>pglz</literal>, and
+ (if configured when <productname>PostgreSQL</productname>was built)
+ <literal>lz4</literal> and zstd.
+ The default value is <literal>off</literal>.
+ Only superusers can change this setting.

This is making the docs less verbose. I think that you should keep
the mention to respectively --with-lz4 and --with-zstd after each
value.

<para>
Build with <productname>ZSTD</productname> compression support.
+ This enables use of <productname>ZSTD</productname> for
+ compression of WAL data.

This addition is not necessary, see d7a9786.

Not related to your patch, but we have more reasons to add an
check in the block of BKPIMAGE_COMPRESSED() in RestoreBlockImage() of
xlogreader.c to make sure that only one bit is set for the compression
type. We could use a pg_popcount() == 1 for that, returning
report_invalid_record() when we see some corrupted data.

As a whole, 0001 looks pretty good to me after a detailed read (not
tested, though).

> Only 0001 should be reviewed for pg15 - the others are optional/future work.

That's wiser for v15. FWIW, I have the impression that we don't need
most of what's proposed in 0002~.

/* compression methods supported */
-#define BKPIMAGE_COMPRESS_PGLZ 0x04
-#define BKPIMAGE_COMPRESS_LZ4 0x08
-#define BKPIMAGE_COMPRESS_ZSTD 0x10
-
+#define BKPIMAGE_COMPRESS_METHOD1 0x04 /* bits to encode compression method */
+#define BKPIMAGE_COMPRESS_METHOD2 0x08 /* 0=none, 1=pglz, 2=lz4, 3=zstd */

As of 0002. We still have some room for this data and this makes the
code harder to follow, so I'd live this part of the code as it is
proposed in 0001.

0003, defaulting to zstd, and 0004 to extend compression to support a
level would require a lot of benchmarking to be justified. I have
argued against making the code more complicated for such things in the
past, with reloptions. The footprint on the code is much smaller,
here, still..

0007, for ZLIB, does not make sense once one can choose between LZ4
and ZSTD.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2022-03-04 07:27:25 Re: logical replication empty transactions
Previous Message Masahiko Sawada 2022-03-04 06:55:00 Re: Optionally automatically disable logical replication subscriptions on error