Re: Different compression methods for FPI

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Different compression methods for FPI
Date: 2021-05-19 03:06:18
Message-ID: 20210519030618.GS373@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, May 17, 2021 at 04:44:11PM +0900, Michael Paquier wrote:
> On Sun, Mar 21, 2021 at 02:30:04PM -0500, Justin Pryzby wrote:
>
> For this patch, this is going to require a bit more in terms of library
> linking as the block decompression is done in xlogreader.c, so that's one
> thing to worry about.

I'm not sure what you mean here ?

> + {
> + {"wal_compression_method", PGC_SIGHUP, WAL_SETTINGS,
> + gettext_noop("Set the method used to compress full page images in the WAL."),
> + NULL
> + },
> + &wal_compression_method,
> + WAL_COMPRESSION_PGLZ, wal_compression_options,
> + NULL, NULL, NULL
> + },
> The interface is not quite right to me. I think that we should just
> change wal_compression to become an enum, with extra values for pglz
> and the new method. "on" would be a synonym for "pglz".

Andrey gave a reason in March:

| I hope one day we will compress all WAL, not just FPIs. Advanced archive management tools already do so, why not compress it in walwriter?
| When this will be implemented, we could have wal_compression = {off, fpi, all}.

> +/* This is a mapping indexed by wal_compression */
> +// XXX: maybe this is better done as a GUC hook to assign the 1)
> method; and 2) level
> +struct walcompression walmethods[] = {
> + {"pglz", WAL_COMPRESSION_PGLZ},
> + {"zlib", WAL_COMPRESSION_ZLIB},
> +};
> Don't think you need a hook here, but zlib, or any other method which
> is not supported by the build, had better not be listed instead. This
> ensures that the value cannot be assigned if the binaries don't
> support that.

I think you're confusing the walmethods struct (which is unconditional) with
wal_compression_options, which is conditional.

> The patch set is a gathering of various things, and not only things
> associated to the compression method used for FPIs.
> What is the point of that in patch 0002?
> > Subject: [PATCH 03/12] Make sure published XIDs are persistent
> Patch 0003 looks unrelated to this thread.

..for the reason that I gave:

| And 2ndary patches from another thread to allow passing recovery tests.
|These two patches are a prerequisite for this patch to progress:
| * Run 011_crash_recovery.pl with wal_level=minimal
| * Make sure published XIDs are persistent

> > Subject: [PATCH 04/12] wal_compression_method: default to zlib..
>
> Patch 0004 could happen, however there are no reasons given why this
> is adapted. Changing the default is not going to happen for the time
> release where this feature is added, anyway.

From the commit message:
| this is meant to exercise the CIs, and not meant to be merged

> + default:
> + report_invalid_record(record, "image at %X/%X is compressed with unsupported codec, block %d (%d/%s)",
> + (uint32) (record->ReadRecPtr >> 32),
> + (uint32) record->ReadRecPtr,
> + block_id,
> + compression_method,
> + wal_compression_name(compression_method));
> + return false;
> In xlogreader.c, the error message is helpful this way. However, we
> would not know which compression method failed if there is a
> decompression failure for a method supported by the build restoring
> this block. That would be good to add.

I don't undersand you here - that's what wal_compression_name is for ?
2021-05-18 21:38:04.324 CDT [26984] FATAL: unknown compression method requested: 2(lz4)

> I think that what we actually need for this thread are patches 0001,
> 0005 and 0006 merged together to study first the performance we have
> with each one of the compression methods proposed, and then let's just
> pick one. Reading around, zstd and zlib compresse more but take
> longer. LZ4 is faster than the others, but can compress less.
> With limited bandwidth, less data makes sense, and my guess is that
> most users care most about the speed of recovery if we can afford
> speed with an acceptable compression ratio.

I don't see why we'd add a guc for configuration compression but not include
the 30 lines of code needed to support a 3rd method that we already used by the
core server.

--
Justin

Attachment Content-Type Size
v7-0001-Allow-alternate-compression-methods-for-wal_compr.patch text/x-diff 15.0 KB
v7-0002-Run-011_crash_recovery.pl-with-wal_level-minimal.patch text/x-diff 1002 bytes
v7-0003-Make-sure-published-XIDs-are-persistent.patch text/x-diff 7.2 KB
v7-0004-wal_compression_method-default-to-zlib.patch text/x-diff 1.5 KB
v7-0005-re-add-wal_compression_method-lz4.patch text/x-diff 6.7 KB
v7-0006-add-wal_compression_method-zstd.patch text/x-diff 17.5 KB
v7-0007-Default-to-LZ4.patch text/x-diff 2.8 KB
v7-0008-Default-to-zstd.patch text/x-diff 2.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-05-19 03:26:07 Re: Multiple pg_waldump --rmgr options
Previous Message Michael Paquier 2021-05-19 03:03:50 Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS