Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)

From: "Andres Freund" <andres(at)anarazel(dot)de>
To: "Laurenz Albe" <laurenz(dot)albe(at)cybertec(dot)at>, "Stephen Frost" <sfrost(at)snowman(dot)net>, "Peter Geoghegan" <pg(at)bowt(dot)ie>
Cc: "Michael Banck" <michael(dot)banck(at)credativ(dot)de>, "Michael Paquier" <michael(at)paquier(dot)xyz>, "PostgreSQL Development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)
Date: 2021-01-08 10:03:39
Message-ID: 37c1f3d9-eaa1-4cad-b87e-e811e3b07ef3@www.fastmail.com
Lists: pgsql-hackers

Hi,

On Fri, Jan 8, 2021, at 01:53, Laurenz Albe wrote:
> On Thu, 2021-01-07 at 16:14 -0500, Stephen Frost wrote:
> > I expected there'd be some disagreement on this, but I do continue to
> > feel that it's sensible to enable checksums by default.
>
> +1

I don't disagree with this in principle, but if you want that, you need to work on making the checksum overhead far smaller. That's doable. After that, it makes sense to have this discussion.

> I think the problem here (apart from the original line of argumentation)
> is that there are two kinds of PostgreSQL installations:
>
> - installations done on dubious hardware with minimal tuning
> (the "cheap crowd")
>
> - installations done on good hardware, where people make an effort to
> properly configure the database (the "serious crowd")
>
> I am aware that this is an oversimplification for the sake of the argument.
>
> The voices against checksums on by default are probably thinking of
> the serious crowd.
>
> If checksums were enabled by default, the cheap crowd would benefit
> from the early warnings that something has gone wrong.
>
> The serious crowd are more likely to choose a non-default setting
> to avoid paying the price for a feature that they don't need.

I don't really buy this argument. That way we're going to have an ever-growing set of things that need to be tuned to get a database that's usable in an even halfway busy setup. That's unavoidable in some cases, but it's a significant cost across use cases.

Increasing the overhead of the default config from one version to the next isn't great - it makes people more hesitant to upgrade. It's also not a cost you're going to find all that quickly, and it's a cost that's really hard to pin down.

Andres
