| From: | Daniel Gustafsson <daniel(at)yesql(dot)se> |
|---|---|
| To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
| Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, Andres Freund <andres(at)anarazel(dot)de>, Bernd Helmle <mailings(at)oopsware(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Michael Banck <mbanck(at)gmx(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Changing the state of data checksums in a running cluster |
| Date: | 2026-04-02 18:48:17 |
| Message-ID: | B3162DDF-435A-4BFD-A99B-32A038F57051@yesql.se |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
> On 2 Apr 2026, at 11:27, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
Thanks for looking!
> It'd be good to print a LOG message when the checksums state changes, to have a trail in the log of when checksums were enabled/disabled. Something like:
>
> LOG: enabling checksums was requested, starting checksum calculation
> ...
> LOG: checksums calculation finished, checksums are now enabled
Done.
> On 02/04/2026 02:01, Daniel Gustafsson wrote:
>> + if (result == DATACHECKSUMSWORKER_FAILED)
>> + {
>> + /*
>> + * Disable checksums on cluster, because we failed one of the
>> + * databases and this is an all or nothing process.
>> + */
>> + SetDataChecksumsOff();
>> + ereport(ERROR,
>> + errcode(ERRCODE_INSUFFICIENT_RESOURCES),
>> + errmsg("data checksums failed to get enabled in all databases, aborting"),
>> + errhint("The server log might have more information on the cause of the error."));
>> + }
>
> This got me thinking, what happens if the the data checksums launcher encounters some other error, for example if you SIGTERM it? The system is left in 'inprogress-on' state, but because the launcher is gone it will never finish and 'pg_stat_progress_data_checksums' will be empty.
>
> Perhaps launcher_exit() should call SetDataChecksumsOff()?
I made launcher_exit call SetDataChecksumsOff in case the current state is
in-progress.
>> /*
>> * TODO: how to really handle the worker still running when the
>> * launcher exits?
>> */
>> if (DataChecksumsWorkerShmem->worker_running)
>> ereport(LOG,
>> errmsg("data checksums launcher exiting while worker is still running"));
>
> That TODO should be addressed somehow.
Since processing cannot reach a new state if the launcher goes away, I resorted
to signalling the worker to die in this case.
>> + /*
>> + * As of now we only update the block counter for main forks in order
>> + * to not cause too frequent calls. TODO: investigate whether we
>> + * should do it more frequent?
>> + */
>> + if (forkNum == MAIN_FORKNUM)
>> + pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
>> + (blknum + 1));
>
> We're updating it for every block in the main fork, but not at all for other forks? What a bizarre way of avoiding too frequent calls :-). I think you could just call this on every page, pgstat_progress_update_param() is supposed to be very fast. For comparison, e.g. index build calls it for every tuple.
Ok, done.
>> +static const ChecksumBarrierCondition checksum_barriers[4] =
>> +{
>> + /*
>> + * When disabling checksums, either inprogress state is Ok but checksums
>> + * must not be in the enabled state.
>> + */
>> + {
>> + .target = PG_DATA_CHECKSUM_OFF,
>> + .barrier_eq = {PG_DATA_CHECKSUM_INPROGRESS_ON, PG_DATA_CHECKSUM_INPROGRESS_OFF},
>> + .barrier_eq_sz = 2,
<..snip..>
>> + .target = PG_DATA_CHECKSUM_INPROGRESS_OFF,
>> + .barrier_eq = {PG_DATA_CHECKSUM_VERSION},
>> + .barrier_eq_sz = 1,
>> + {0}, 0
>> + },
>> +};
>
> I find this to still be a pretty complicated and unclear way of representing the allowed transitions. There are only 16 possible transitions, and only 6 of them are allowed. How about listing the allowed ones directly:
Fair enough, I've replaced it with a simpler to/from struct and the for loop
then checks for a struct member which contains the current as .from and target
as .to.
>> +/*
>> + * Signaling between backends calling pg_enable/disable_data_checksums, the
>> + * checksums launcher process, and the checksums worker process.
>> + *
>> + * This struct is protected by DataChecksumsWorkerLock
>> + */
>> +typedef struct DataChecksumsWorkerShmemStruct
>> +{
>> + /*
>> + * These are set by pg_{enable|disable|verify}_data_checksums, to tell the
>> + * launcher what the target state is.
>> + */
>> + DataChecksumsWorkerOperation launch_operation;
>> + int launch_cost_delay;
>> + int launch_cost_limit;
>
> The naming feels a little weird with this struct. It's called "DataChecksumsWorkerShmemStruct", but it's also accessed by the backends calling pg_enable/disable_data_checksums(). And "DataChecksumsWorkerOperation" is not accessed by workers at all. Or I guess the "operation" global variable is used in DataChecksumsWorkerMain(), but it's always set to ENABLE_DATACHECKSUMS in the worker. Do you need the "operation" global variable at all?
I admittedly hadn't realized but you are quite right. I renamed the structure
to DataChecksumsState instead which seems to fit better. The operation
variable is used to check against the launch_operation such that a call to
disable checksums while an enabling is running will abort processing
gracefully.
>> + if (XLogCtl->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON)
>> + {
>> + SpinLockRelease(&XLogCtl->info_lck);
>> + elog(PANIC, "checksums not in \"inprogress-on\" mode");
>> + }
>> +
>> + SpinLockRelease(&XLogCtl->info_lck);
>
> The PANIC seems a little harsh, you haven't done anything destructive here. It's unexpected for this to be called in any other state, so this is a "can't happen" scenario, but I don't think we usually PANIC on those.
That's a good point, I've replaced with a WARNING and a SetDataChecksumsOff()
call to move to a safe state.
>> + <para>
>> + If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
>> + specified, the process is throttled using the same principles as
>> + <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
>> + </para>
>
> Ugh, yet another place where we expose the "cost delay/limit" throttling mechanism. I agree it's good to be consistent here and use the same method we use for vacuum, I just wish we had something more user-friendly..
Agreed on both counts.
> Grammar / spelling:
>
>> + * state will also be set of "off".
>
>> + * When moving to inprogress-on the current state cannot enabled, but when
>
>> + * If a worker process currently running? This is set by the worker
>
>> + * These are set by pg_{enable|disable|verify}_data_checksums, to tell the
>
> there is no "pg_verify_data_checksums" function.
Fixed.
> "calcuated" in commit message
Fixed.
Thank you so much the continued reviewing support. The attached rebase
contains the above fixes as well a few small cleanups detected by Github
CoPilot review. This version removes more than it adds, a trend which has been
going on for quite some versions, and the patch is now just north of 5000 loc
compared to the 5300 it was not long ago. The changes from the previous are
attached as a text file as well for easier teasing apart what was new.
--
Daniel Gustafsson
| Attachment | Content-Type | Size |
|---|---|---|
| v20260402_2-0001-Online-enabling-and-disabling-of-data-ch.patch | application/octet-stream | 223.6 KB |
| v20260402_2_diff.txt | text/plain | 25.3 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Pavel Stehule | 2026-04-02 18:57:41 | Re: WIP - xmlvalidate implementation from TODO list |
| Previous Message | Tom Lane | 2026-04-02 18:47:52 | Re: pg_waldump: support decoding of WAL inside tarfile |