Re: Enable data checksums by default

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>, Greg Burd <greg(at)burd(dot)me>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, Michael Banck <mbanck(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enable data checksums by default
Date: 2025-07-31 22:10:30
Message-ID: f952295d46c589703a6c5a1c91466e85d48bfd1c.camel@j-davis.com
Lists: pgsql-hackers

On Thu, 2025-07-31 at 17:21 +0200, Tomas Vondra wrote:
> On 7/31/25 15:39, Greg Burd wrote:
> > I recall a conversation at the last PGConf.dev (2025) with a
> > representative from Intel and Jeff Davis (CC’ed) that had to do with
> > checksums and a vast performance difference between Intel and AMD
> > the latter winning by a mile.
>
> I don't know the Intel vs. AMD situation exactly, but e.g. [1] does not
> suggest AMD wins by a mile. In fact, it suggests Intel does much better
> in this particular benchmark (with AVX-512 improvements). Of course,
> this is a fairly recent *kernel* improvement, maybe it wouldn't work for
> our data checksums that well.
>
> However, I don't think the cost of the checksum calculation itself is
> the main concern. It's probably negligible compared to all the other
> costs, triggered by checksums - having to WAL-log hint bits, doing more
> expensive checks (that's what the btree regression was about), etc.

The issue Greg and I discussed, explained to me earlier by Andres, was
a memory bandwidth issue.

IIRC (Andres please correct me): The new IO infrastructure enables us
to bypass a memory copy (from userspace to kernel space) when writing
out a page. Unfortunately, checksums require reading the data to
calculate the checksum, which effectively defeats that optimization.

Those memory copies mostly happen in the bgwriter, where the page isn't
generally in the cache, which means that memory bandwidth can become
the bottleneck. Intel seems to have poor per-core memory bandwidth
compared with AMD:

https://sites.utexas.edu/jdm4372/2023/04/25/the-evolution-of-single-core-bandwidth-in-multicore-processors/

so it's more likely to become the bottleneck on Intel.

That led to an interesting discussion about calculating the checksum
on a page in the backend eagerly when it dirties a page, while it's
still in cache. As you point out, that's quite cheap.
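
A toy model of that idea, as a rough sketch only: the Buffer class, its
field names, and the use of zlib.crc32 are all assumptions made for
illustration, not PostgreSQL's buffer manager or its actual checksum
algorithm. It only shows where the page gets scanned: at modification
time (eager) or again at write-out time (lazy).

```python
import zlib

PAGE_SIZE = 8192  # PostgreSQL's default block size


def page_checksum(page) -> int:
    # Stand-in only: PostgreSQL's real page checksum is a different algorithm.
    return zlib.crc32(page) & 0xFFFF


class Buffer:
    """Hypothetical shared buffer: page bytes plus a cached checksum."""

    def __init__(self) -> None:
        self.page = bytearray(PAGE_SIZE)
        self.dirty = False
        self.checksum = page_checksum(self.page)
        self.checksum_valid = True
        self.writer_scans = 0  # times the write-out path had to read the page

    def modify(self, offset: int, data: bytes, eager: bool) -> None:
        """Backend side: change the page, optionally checksumming right away."""
        if offset < 0 or offset + len(data) > PAGE_SIZE:
            raise ValueError("write does not fit in the page")
        self.page[offset:offset + len(data)] = data
        self.dirty = True
        if eager:
            # The backend just touched the page, so it is likely still cached.
            self.checksum = page_checksum(self.page)
            self.checksum_valid = True
        else:
            self.checksum_valid = False

    def write_out(self) -> int:
        """Writer side: only scans the page if no valid checksum is cached."""
        if not self.checksum_valid:
            self.writer_scans += 1
            self.checksum = page_checksum(self.page)
            self.checksum_valid = True
        self.dirty = False
        return self.checksum


eager_buf, lazy_buf = Buffer(), Buffer()
eager_buf.modify(100, b"tuple", eager=True)
lazy_buf.modify(100, b"tuple", eager=False)
eager_sum, lazy_sum = eager_buf.write_out(), lazy_buf.write_out()
print(eager_buf.writer_scans, lazy_buf.writer_scans)
```

In this sketch both paths produce the same checksum, but only the lazy
path makes the writer read the page. The visible trade-off is that the
eager path recomputes the checksum on every modification, so a page
changed many times between writes is checksummed many times.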

Regards,
Jeff Davis
