Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Michael Banck <michael(dot)banck(at)credativ(dot)de>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)
Date: 2021-01-07 06:42:17
Message-ID: CAA4eK1KAAihwaPLZdo2y9pLUVpXfmHaB9yghrv9bnL1MxnDStA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 7, 2021 at 3:32 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Wed, Jan 6, 2021 at 1:29 PM Michael Banck <michael(dot)banck(at)credativ(dot)de> wrote:
> > That one seems to be 5min everywhere, and one can change it except on
> > Azure.
>
> Okay, thanks for clearing that up. Looks like all of the big 3 cloud
> providers use Postgres checksums in a straightforward way.
>

But they might have done something to reduce the impact of enabling
checksums, such as using a different checksum (for data blocks) and/or
a different compression technique (for WAL).

> I don't have much more to say on this thread. I am -1 on the current
> proposal to enable page-level checksums by default.
>

-1 from me too, given the impact it can currently have on performance
and WAL volume. I was looking at an old thread on this topic and came
across the benchmarking done by Tomas Vondra [1]. It clearly shows
that enabling checksums can have a huge impact on init time, WAL, and
TPS.

Having said that, if we really want to enable checksums, can't we
think of improving performance when they are enabled? I can think of
two things to improve: (a) a better algorithm for WAL compression
(currently we use pglz), which would allow us to enable wal_compression
at least when data_checksums is enabled; and (b) a better technique for
checksums, to reduce the cost of PageSetChecksumCopy. I don't have good
ideas to offer for either of these two areas, but I think they are
worth investigating if we want to enable checksums.
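
To make the two costs in (a) and (b) concrete, here is a rough
illustration only. It is not PostgreSQL code: zlib stands in for pglz,
crc32 stands in for the page checksum, and the page contents, helper
names, and compression levels are all made up for the example. It just
shows the kind of measurement one could start from: how much a
page-sized image shrinks under compression, and that checksumming a
private copy of a page costs a copy plus a pass over the block.

```python
import zlib

BLCKSZ = 8192  # PostgreSQL's default page size in bytes


def make_page(row: bytes = b"id=0000|balance=0|filler-filler|") -> bytes:
    """Build a fake page by repeating a row-like pattern (hypothetical data)."""
    reps = BLCKSZ // len(row) + 1
    return (row * reps)[:BLCKSZ]


def compressed_size(page: bytes, level: int) -> int:
    """Bytes needed to store the page image after zlib compression."""
    return len(zlib.compress(page, level))


def checksum_of_copy(page: bytearray) -> int:
    """Checksum a private copy so later changes to the original cannot affect it."""
    private = bytes(page)        # the extra copy is part of the cost
    return zlib.crc32(private)   # stand-in; not PostgreSQL's page checksum


page = make_page()
for level in (1, 6, 9):
    print(f"zlib level {level}: {BLCKSZ} -> {compressed_size(page, level)} bytes")
print(f"crc32 of copied page: {checksum_of_copy(bytearray(page)):#010x}")
```

Real pages are far less repetitive than this one, so the ratios printed
here say nothing about actual workloads; the point is only the shape of
the trade-off between CPU spent per page and bytes written.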

[1] - https://www.postgresql.org/message-id/20190330192543.GH4719%40development

--
With Regards,
Amit Kapila.
