Re: Enable data checksums by default

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, Michael Banck <mbanck(at)gmx(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enable data checksums by default
Date: 2025-05-23 10:05:37
Message-ID: 12035808-43E3-4BBA-BEE5-AA44713C85CC@yesql.se
Lists: pgsql-hackers

> On 23 May 2025, at 11:55, Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
> On 5/23/25 11:25, Daniel Gustafsson wrote:
>>> On 23 May 2025, at 10:10, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>>
>>> Aside from just documenting it, I see two things we could do:
>>>
>>> 1. Have pg_upgrade run initdb for you. It's always felt silly that you need to run initdb with the new version yourself, when there's really only one correct way to do it. pg_upgrade has all the checks to verify that you did it right, so why doesn't it just do it itself? I think that'd be a good long-term solution. Might be too late for 18, but I'm not sure. If someone wrote the patch we could evaluate it. To use that mode, the scripts calling pg_upgrade would need to be changed, though, so we'd perhaps want to do #2 or something else in addition to this.
>>
>> I can see this being desired longer term, but as you mention there are likely to
>> be many moving parts outside of our immediate control making it much harder
>> than just adding the call to initdb. It doesn't seem like a post-beta patch to
>> me given the implications for packagers and others in the ecosystem.
>>
>>> 2. If the new cluster has checksums enabled, but the old one has them disabled, have pg_upgrade disable checksums in the new cluster.
>>
>> IF we do this it should be Very visible, since a user otherwise might think
>> that their upgraded cluster will have checksums since they added them in
>> initdb.
>
> What counts as "very visible"? Would it be fine if the pg_upgrade docs
> say this clearly, and pg_upgrade prints a warning? To me that seems
> sufficient.

I was thinking about a warning during processing.

> TBH I can't quite imagine people expecting checksums to just magically
> appear after upgrade.

I would not be surprised if users expect checksums to be on after reading
(variations of) "checksums are now on by default" messaging.

>> I think we should document how to deal with checksums in upgrades, and perhaps
>> even tweak the error message in the pg_upgrade check with explanatory comments
>> if needed, and leave the functionality as is today.
>
> Isn't that just an unnecessary breakage of existing tooling? I mean,
> there's pretty much just one thing the user can do to make it work, and
> that's disabling checksums. Sure, they might also enable checksums on
> the old cluster, but that makes the upgrade much longer, and presumably
> they use pg_upgrade to upgrade quickly.

We already expect the new cluster to be created in the Right Way (which I agree
isn't very user-friendly and should be improved upon) so requiring this to be
Right is in line with existing tooling IMHO (for better or worse). My concern
is that users will think data checksums are enabled after the upgrade, and will
be annoyed when finding out they're not.

--
Daniel Gustafsson
