| From: | Tomas Vondra <tomas(at)vondra(dot)me> |
|---|---|
| To: | Daniel Gustafsson <daniel(at)yesql(dot)se> |
| Cc: | SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, Ayush Tiwari <ayushtiwari(dot)slg01(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Bernd Helmle <mailings(at)oopsware(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Michael Banck <mbanck(at)gmx(dot)net> |
| Subject: | Re: Changing the state of data checksums in a running cluster |
| Date: | 2026-05-28 11:51:14 |
| Message-ID: | 538e820b-db2a-4f53-ba24-c354c72fc1a9@vondra.me |
| Lists: | pgsql-hackers |

On 5/28/26 13:28, Daniel Gustafsson wrote:
>> On 26 May 2026, at 20:12, Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
>> I suppose this means we should not be updating the checksum state
>> without emitting the barrier? I think all other places do that.
>
> Good catch, it's indeed a bug, any state change must emit a procsignalbarrier
> to maintain cluster consistency. I ended up writing a test for this very case
> as well.
>
Good.

>> I'm still not sure if it really is an issue or just an annoyance,
>> because I've not been able to find a case where it'd lead to checksum
>> failures (or obviously incorrect final state after recovery).
>
> I've tried to get it to reach an incorrect end state but failed, but I do agree
> that maybe we need an improved locking protocol around state updates. Need to
> spend some more time thinking about this.
>
OK

>> I still don't understand why this needs DELAY_CHKPT_START ...
>
> Having stared at this for some time, and going over old threads, I think this
> is a mistake. AFAICT though it cannot cause any error, so I'd lean towards
> erring on the safe side by leaving as is and looking at removing in 20. What
> do you think?
>
I'd probably try to fix this for 19, otherwise it may be confusing
people looking at the code in the future. We're still months from 19
getting released. Ofc, maybe I'm underestimating the risk.

>> I also noticed a couple minor comment issues, per attached patch (this
>> may need pgindent).
>
> I ended up splitting this into two, one for the comment fixes and one for the
> data type change.
>
> I propose applying the three patches below to v19 to fix the promotion issue
> before we wrap beta1.
>
WFM

> --
> Daniel Gustafsson
>
--
Tomas Vondra