| From: | Andres Freund <andres(at)anarazel(dot)de> |
|---|---|
| To: | pgsql-hackers(at)postgresql(dot)org, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
| Subject: | freespace buffer locking |
| Date: | 2025-12-01 22:41:01 |
| Message-ID: | 4wggb7purufpto6x35fd2kwhasehnzfdy3zdcu47qryubs2hdz@fa5kannykekr |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
In the context of [1] I started looking at the freespace code, in particular
fsm_vacuum_page(), which has this comment:
/*
* Reset the next slot pointer. This encourages the use of low-numbered
* pages, increasing the chances that a later vacuum can truncate the
* relation. We don't bother with a lock here, nor with marking the page
* dirty if it wasn't already, since this is just a hint.
*/
I.e. we modify the buffer without even holding a share lock on the page. That
seems ... not ok.
What if, e.g., the page were included in a WAL record? Then this would corrupt
the record checksum. Now, this normally won't happen, was the FSM isn't WAL
logged, but still. And I think there may be special circumstances where the
page is included in a WAL record, e.g. as part of an CREATE DATABASE. And
there's FreeSpaceMapPrepareTruncateRel() - which hopefully can't run
concurrently with fsm_vacuum_page(), but would seem to court WAL corruption,
if it ever did.
Besides modifying the page while not even share locked, there are a few other
oddities:
There seem to be some other oddities:
- GetRecordedFreeSpace() does fsm_get_avail() without locking
- fsm_vacuum_page() does fsm_get_avail(), fsm_get_max_avail() without locking
ISTM we clearly should take a lock in fsm_vacuum_page() to reset fp_next_slot,
that just seems like a nasty hard to find bug waiting to happen. Changing it
to not look at the page without a lock seems a bit more challenging.
I suspect the omission of the lock in GetRecordedFreeSpace() in 15c121b3ed7e
wasn't intentional? Heikki, you probably don't remember? :). I think we
should fix that - none of the callers look like they'd be anywhere near
frequent enough in real workloads to make that a problem?
Greetings,
Andres Freund
[1] https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7%40az2pljabhnff
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Nathan Bossart | 2025-12-01 22:43:37 | Re: split tablecmds.c |
| Previous Message | Myles Lewis | 2025-12-01 22:38:43 | Re: [PATCH] contrib: Add pg_datemath extension with datediff function |