Re: wal_compression=zstd

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: wal_compression=zstd
Date: 2022-03-09 13:14:11
Message-ID: 20220309131411.GZ27651@telsasoft.com
Lists: pgsql-hackers

On Fri, Mar 04, 2022 at 05:44:06AM -0600, Justin Pryzby wrote:
> On Fri, Mar 04, 2022 at 04:19:32PM +0900, Michael Paquier wrote:
> > On Tue, Feb 22, 2022 at 05:19:48PM -0600, Justin Pryzby wrote:
> >
> > > As written, this patch uses zstd level=1 (whereas ZSTD's default compression
> > > level is 6).
> >
> > Why? ZSTD using this default has its reasons, no? And it would be
> > consistent to do the same for ZSTD as for the other two methods.
>
> In my 1-off test, it gets 610/633 = 96% of the benefit at 209/273 = 77% of the
> cost.

Actually, my test used zstd-6, rather than the correct default of 3.

The comparison should have been:

postgres=# SET wal_compression='zstd-1';
postgres=# \set QUIET \\ \timing on \\ SET max_parallel_maintenance_workers=0; SELECT pg_stat_reset_shared('wal'); begin; CREATE INDEX ON t(a); rollback; SELECT * FROM pg_stat_wal;
Time: 2074.046 ms (00:02.074)
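 wal_records | wal_fpi | wal_bytes | wal_buffers_full | wal_write | wal_sync | wal_write_time | wal_sync_time |          stats_reset
-------------+---------+-----------+------------------+-----------+----------+----------------+---------------+-------------------------------
        2763 |    2758 |   6343591 |                0 |         5 |        5 |              0 |             0 | 2022-03-05 05:04:08.599867-06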

vs

postgres=# SET wal_compression='zstd-3';
postgres=# \set QUIET \\ \timing on \\ SET max_parallel_maintenance_workers=0; SELECT pg_stat_reset_shared('wal'); begin; CREATE INDEX ON t(a); rollback; SELECT * FROM pg_stat_wal;
Time: 2471.552 ms (00:02.472)
 wal_records | wal_fpi | wal_bytes | wal_buffers_full | wal_write | wal_sync | wal_write_time | wal_sync_time |          stats_reset
-------------+---------+-----------+------------------+-----------+----------+----------------+---------------+-------------------------------
        2762 |    2746 |   6396890 |              274 |       274 |        0 |              0 |             0 | 2022-03-05 05:04:31.283432-06

=> zstd-1 actually wrote less WAL than zstd-3 (which is odd), but by an
insignificant amount (under 1% of wal_bytes). It's no surprise that zstd-1 is
faster than zstd-3, but (of course) its advantage is smaller than it was over
zstd-6.
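
To make the comparison concrete, here is a minimal sketch of the arithmetic, using only the Time and wal_bytes figures from the two runs above. The dictionary layout and the pct_change helper are my own naming for illustration, and this is still a single one-off run per setting, so the percentages should be read as rough indications rather than a benchmark.

```python
# Figures copied from the two psql runs above (one run per setting).
runs = {
    "zstd-1": {"ms": 2074.046, "wal_bytes": 6343591},
    "zstd-3": {"ms": 2471.552, "wal_bytes": 6396890},
}


def pct_change(new, base):
    """Signed percentage change of `new` relative to `base`."""
    return 100.0 * (new - base) / base


# Compare zstd-1 against zstd-3, treating zstd-3 as the baseline.
base, cand = runs["zstd-3"], runs["zstd-1"]
time_delta = pct_change(cand["ms"], base["ms"])
bytes_delta = pct_change(cand["wal_bytes"], base["wal_bytes"])

print(f"zstd-1 vs zstd-3: time {time_delta:+.1f}%, wal_bytes {bytes_delta:+.1f}%")
```

On these numbers, zstd-1 takes roughly 16% less time and writes just under 1% less WAL than zstd-3.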

Anyway, there's no compelling reason not to use the default. If we were to use
a non-default default, we'd have to choose between 1 and 2 (or some negative
compression level). My thinking was that zstd-1 would pick the lowest-hanging
fruit for zstd while minimizing the performance tradeoff, since WAL affects
interactivity. But choosing between 1 and 2 seems like bikeshedding.
