| From: | Álvaro Herrera <alvherre(at)kurilemu(dot)de> |
|---|---|
| To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: race condition when writing pg_control |
| Date: | 2026-02-02 14:34:47 |
| Message-ID: | 202602021426.ztjk2wp6sgiy@alvherre.pgsql |
| Lists: | pgsql-hackers |
On 2024-May-18, Thomas Munro wrote:
> First idea I've come up with to avoid all of that: pass a copy of
> the "proto-controlfile", to coin a term for the one read early in
> postmaster startup by LocalProcessControlFile(). As far as I know,
> the only reason we need it is to suck some settings out of it that
> don't change while a cluster is running (mostly can't change after
> initdb, and checksums can only be {en,dis}abled while down). Right?
> Children can just "import" that sucker instead of calling
> LocalProcessControlFile() to figure out the size of WAL segments yada
> yada, I think? Later they will attach to the real one in shared
> memory for all future purposes, once normal interlocking is allowed.
>
> I dunno. Draft patch attached. Better plans welcome. This passes CI
> on Linux systems afflicted by EXEC_BACKEND, and Windows. Thoughts?

Has this problem been addressed? Looking at the known-buildfarm-failures
page,
https://wiki.postgresql.org/wiki/Known_Buildfarm_Test_Failures#culicidae_failed_to_restart_server_due_to_incorrect_checksum_in_control_file
there are still some failures of that ilk, the most recent on 2026-01-21.

So, was this "proto-controlfile" idea discarded? I see Noah downthread
proposed something somewhat more sophisticated than this, setting some
values to garbage to prevent reading invalid values. I imagine that
would go on top of Thomas' patch, so I have rebased the patch and moved
it to the next commitfest.
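
For readers skimming the thread, the hand-off described in the quoted proposal can be sketched roughly as follows: the parent builds a snapshot of the settings that cannot change while the cluster runs, and the child imports those bytes instead of re-reading pg_control. This is not the attached patch. `ProtoControl`, `proto_fill`, `proto_import` and the FNV-1a checksum are hypothetical names and stand-ins chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for the settings a child needs early on. */
typedef struct ProtoControl
{
    uint32_t wal_segment_size;       /* fixed at initdb time */
    uint32_t data_checksum_version;  /* only changes while the cluster is down */
    uint32_t check;                  /* checksum over the fields above */
} ProtoControl;

/* Toy FNV-1a checksum; an assumption standing in for a real CRC. */
uint32_t
proto_checksum(const ProtoControl *pc)
{
    const unsigned char *p = (const unsigned char *) pc;
    uint32_t h = 2166136261u;

    /* Hash every byte that precedes the checksum field itself. */
    for (size_t i = 0; i < offsetof(ProtoControl, check); i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Parent side: fill the snapshot once, after the file has been read. */
void
proto_fill(ProtoControl *pc, uint32_t wal_segment_size,
           uint32_t data_checksum_version)
{
    assert(pc != NULL);
    pc->wal_segment_size = wal_segment_size;
    pc->data_checksum_version = data_checksum_version;
    pc->check = proto_checksum(pc);
}

/*
 * Child side: import the bytes handed over by the parent.  The child never
 * opens the on-disk file here, so a concurrent writer cannot be observed.
 */
bool
proto_import(ProtoControl *dst, const void *buf, size_t len)
{
    ProtoControl tmp;

    if (len != sizeof(ProtoControl))
        return false;           /* truncated or oversized hand-off */
    memcpy(&tmp, buf, sizeof(tmp));
    if (tmp.check != proto_checksum(&tmp))
        return false;           /* snapshot was damaged in transit */
    *dst = tmp;
    return true;
}
```

In this sketch a failed import points at a bad hand-off between parent and child, not at a torn read of the file, which is the failure mode the proposal is trying to remove.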
Thanks
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Crear es tan difícil como ser libre" (Elsa Triolet)
| Attachment | Content-Type | Size |
|---|---|---|
| v2-0001-Fix-pg_control-corruption-in-EXEC_BACKEND-startup.patch | text/x-diff | 6.7 KB |