Re: Changing the state of data checksums in a running cluster

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Daniel Gustafsson <daniel(at)yesql(dot)se>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Bernd Helmle <mailings(at)oopsware(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Michael Banck <mbanck(at)gmx(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Changing the state of data checksums in a running cluster
Date: 2026-03-27 23:13:40
Message-ID: 68e8b8f2-c40a-4f0a-9d3f-5092d89d7afe@vondra.me
Lists: pgsql-hackers

Hi Daniel,

On 3/27/26 23:03, Daniel Gustafsson wrote:
> The attached rebase contains lots more polish, mostly renaming variable names
> for clarity, tidying up comments and documentation and some smaller bits of
> cleanup like moving more code out of xlog.c.
>
> This version runs all the tests in a normal test-run, with a few of them pared
> down with larger runs gated by PG_TEST_EXTRA. I think the tests are still too
> expensive in the event of getting committed, but it's helpful to have them
> during dev and test. Executing pgbench sometimes fails in CI but I've been
> unable to reproduce that so not entirely sure what is going on there.
>
> Heikki, Andres and Tomas; as you have been reviewing this patchset, what do you
> feel is left for considering this for commit? (Apart from figuring out the CI
> test thing mentioned above which I think is a buildsystem issue.) I think 0001
> could be considered independently of 0002 and is cleanup in its own right.
>

Nothing in particular comes to mind, really. All the suggestions and
ideas I've had regarding the patch I've already shared during the
earlier reviews/testing. I'll take a look over the weekend, but I don't
expect to find anything, especially now that Heikki reviewed it.

The only thing that bothered me was the checksum failures in VM/FSM.
The VM failures were fixed (right?), and the FSM failures are expected
because we don't WAL-log the FSM (and so there are no FPIs for it either).

That's a bit unfortunate, but it's not a new issue or the fault of this
patch, and it doesn't make it any worse. Fine with me.

However, won't this be a problem for the TAP tests? I mean, what happens
after a crash/restart that might have corrupted the FSM? Won't that
result in a test failure?

regards

--
Tomas Vondra
