From: | Tomas Vondra <tomas(at)vondra(dot)me> |
---|---|
To: | Greg Burd <greg(at)burd(dot)me>, Daniel Gustafsson <daniel(at)yesql(dot)se> |
Cc: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, Michael Banck <mbanck(at)gmx(dot)net>, Jeff Davis <pgsql(at)j-davis(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Enable data checksums by default |
Date: | 2025-07-31 15:21:11 |
Message-ID: | 58993856-3ce9-4223-9dbe-6b2853a80628@vondra.me |
Lists: | pgsql-hackers |

On 7/31/25 15:39, Greg Burd wrote:
>
>
>> On Jul 30, 2025, at 8:09 AM, Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
>>
>>> On 30 Jul 2025, at 11:58, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
>>>
>>> On Tue, 2025-07-29 at 20:24 +0200, Tomas Vondra wrote:
>>>> So, what should we do with the PG18 open item? We (the RMT team) would
>>>> like to know if we shall keep the checksums enabled by default, and if
>>>> there's something that still needs to be done for PG18.
>>>
>>> I don't have a strong opinion, but I lean towards having them on
>>> by default.
>>
>> I agree with that; while there might be a lot of cases where disabling
>> checksums is the right move, it's still a sane default.
>>
>> --
>> Daniel Gustafsson
>
> I realize I’m late to the conversation, I’ve been lurking...
>
> I agree that enabling checksums by default is the sane default. Databases
> should always make a best effort for data integrity, and checksums are a
> positive step in that direction.
>
> I recall a conversation at the last PGConf.dev (2025) with a representative
> from Intel and Jeff Davis (CC’ed) that had to do with checksums and a vast
> performance difference between Intel and AMD, the latter winning by a mile.
> I forget the details, maybe Jeff remembers more than I do. I’m not
> suggesting that we disable Intel by default or trying to derail this
> conversation (which appears to be reaching consensus), just raising
> awareness.
>

I don't know the Intel vs. AMD situation exactly, but e.g. [1] does not
suggest AMD wins by a mile. In fact, it suggests Intel does much better
in this particular benchmark (with AVX-512 improvements). Of course,
this is a fairly recent *kernel* improvement, and maybe it wouldn't work
that well for our data checksums.

However, I don't think the cost of the checksum calculation itself is
the main concern. It's probably negligible compared to all the other
costs triggered by checksums: having to WAL-log hint bits, doing more
expensive checks (that's what the btree regression was about), etc.

[1] https://www.phoronix.com/news/Linux-CRC32C-VPCLMULQDQ

cheers

--
Tomas Vondra