Re: Changing the state of data checksums in a running cluster

From: SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Ayush Tiwari <ayushtiwari(dot)slg01(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Bernd Helmle <mailings(at)oopsware(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Michael Banck <mbanck(at)gmx(dot)net>
Subject: Re: Changing the state of data checksums in a running cluster
Date: 2026-05-05 07:43:03
Message-ID: CAHg+QDeevH6aTyWdXYBJW0wOmfoZy66gDi5TfinK_dXeCrHQLg@mail.gmail.com
Lists: pgsql-hackers

Hi Hackers, Daniel,

While testing this feature further, I noticed that the cost_delay and
cost_limit arguments to pg_enable_data_checksums() have no effect in
practice.

It appears we have two independent issues in DataChecksumsWorkerMain():

(1) The worker writes the user-supplied values to VacuumCostDelay and
VacuumCostLimit (the GUC-bound globals in globals.c). However,
vacuum_delay_point() reads vacuum_cost_delay and vacuum_cost_limit,
which are declared in vacuum.c. The two pairs are kept in sync only by
VacuumUpdateCosts(), which the worker never calls. As a result, the
napping formula always sees the defaults (vacuum_cost_delay = 0) and
never sleeps.

(2) The worker also resets VacuumCostPageHit/Miss/Dirty to 0 at startup.
With all per-page weights at zero, VacuumCostBalance never reaches
vacuum_cost_limit, which would defeat the throttling on its own even
if (1) were fixed.
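
To make the interaction between (1) and (2) concrete, here is a minimal
standalone sketch of the throttling logic as I understand it. It is not
PostgreSQL source: ThrottleModel and the model_* functions are hypothetical
names, the single page_cost field stands in for the separate hit/miss/dirty
weights, and the sleep is only accounted for rather than performed.

```c
/*
 * Hypothetical, simplified model of cost-based throttling.
 * NOT PostgreSQL source; every name below is illustrative only.
 */
typedef struct ThrottleModel
{
    int guc_delay_ms;   /* stands in for VacuumCostDelay (globals.c) */
    int guc_limit;      /* stands in for VacuumCostLimit (globals.c) */
    int eff_delay_ms;   /* stands in for vacuum_cost_delay (vacuum.c) */
    int eff_limit;      /* stands in for vacuum_cost_limit (vacuum.c) */
    int page_cost;      /* stands in for the per-page hit/miss/dirty weight */
    int balance;        /* stands in for VacuumCostBalance */
} ThrottleModel;

/* Start from the stock defaults: delay 0 ms, limit 200. */
void model_init(ThrottleModel *m, int page_cost)
{
    m->guc_delay_ms = 0;
    m->guc_limit = 200;
    m->eff_delay_ms = 0;
    m->eff_limit = 200;
    m->page_cost = page_cost;
    m->balance = 0;
}

/* What the worker does today: only the GUC-side pair is written. */
void model_set_user_values(ThrottleModel *m, int delay_ms, int limit)
{
    m->guc_delay_ms = delay_ms;
    m->guc_limit = limit;
}

/* Stands in for VacuumUpdateCosts(): copy the GUC pair to the effective pair. */
void model_update_costs(ThrottleModel *m)
{
    m->eff_delay_ms = m->guc_delay_ms;
    m->eff_limit = m->guc_limit;
}

/* Process npages pages and return the total time (ms) that would be slept. */
long model_process_pages(ThrottleModel *m, int npages)
{
    long slept_ms = 0;

    for (int i = 0; i < npages; i++)
    {
        m->balance += m->page_cost;

        /* Nap only if throttling is active and the balance hit the limit. */
        if (m->eff_delay_ms > 0 && m->balance >= m->eff_limit)
        {
            long msec = (long) m->eff_delay_ms * m->balance / m->eff_limit;

            /* Cap a single nap at four times the configured delay. */
            if (msec > 4L * m->eff_delay_ms)
                msec = 4L * m->eff_delay_ms;
            slept_ms += msec;
            m->balance = 0;
        }
    }
    return slept_ms;
}
```

In this model, calling model_set_user_values(&m, 100, 1) without
model_update_costs() yields no sleep at all (issue 1), and calling both
with page_cost = 0 also yields no sleep because the balance stays at zero
(issue 2). Only with both the sync step and a non-zero page cost does the
model accumulate nap time.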

Repro:

Create a database and load some data (say, 3 GB), then run:

SELECT pg_disable_data_checksums();
SELECT pg_enable_data_checksums(100, 1); -- 100 ms/page, balance limit 1

Without the fix, this completes in ~10 seconds and pg_stat_activity
never shows wait_event = VacuumDelay for the worker. After the patch,
even moderate parameters (e.g. (50, 200)) keep the worker continuously
in VacuumDelay, and the total runtime stretches as one would expect.
I also tested manually with cost_delay = 0 and with higher cost limits.
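
For a rough sense of how far ~10 seconds is from a throttled run, the
back-of-the-envelope helper below estimates the total nap time. It is only a
sketch under assumptions of mine that are not measured in this report: 8 kB
blocks, a uniform per-page cost of 1, and a nap of delay * balance / limit
capped at 4 * delay. estimate_sleep_ms is a hypothetical name, not a
PostgreSQL function.

```c
/*
 * Hypothetical estimate of the total nap time (ms) for a throttled pass.
 * Assumes every page costs page_cost units; real per-page costs vary
 * with whether the page was a hit, a miss, or dirtied.
 */
long long estimate_sleep_ms(long long total_bytes, int block_size,
                            int delay_ms, int limit, int page_cost)
{
    /* No throttling (or nonsensical input): nothing is ever slept. */
    if (delay_ms <= 0 || limit <= 0 || page_cost <= 0 || block_size <= 0)
        return 0;

    long long pages = total_bytes / block_size;

    /* Pages needed for the balance to reach the limit (rounded up). */
    long long pages_per_nap = ((long long) limit + page_cost - 1) / page_cost;
    long long balance = pages_per_nap * page_cost;

    /* Nap length scales with the balance, capped at four times the delay. */
    long long msec = (long long) delay_ms * balance / limit;
    if (msec > 4LL * delay_ms)
        msec = 4LL * delay_ms;

    return (pages / pages_per_nap) * msec;
}
```

Under those assumptions, 3 GB is 393,216 pages, so (100, 1) works out to
39,321,600 ms of napping (roughly 10.9 hours), and (50, 200) to 98,300 ms
(about 98 seconds). Either figure is far above the ~10 seconds observed
without the fix.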

I have attached a patch to fix this.

Thanks,
Satya

Attachment: 0001-Apply-data-checksum-worker-throttling-parameters.patch (application/octet-stream, 2.8 KB)
