From 0148552645fb463713eb26604930ae4c65385c99 Mon Sep 17 00:00:00 2001
From: Adam Lee
Date: Tue, 9 Jun 2026 16:22:35 +0800
Subject: [PATCH] Avoid orphaning buffers when a relation's file is missing

When relations are dropped, DropRelationsAllBuffers() avoids scanning
the whole buffer pool if it can read the size of every fork from the
cache, locating the buffers to invalidate directly.  When a fork's size
is not cached it calls smgrexists(), and if the fork's file does not
exist it skips the fork, treating it as having no buffers.

In a disaster-recovery situation, though, a relation's data file can be
missing on disk while a dirty buffer for it is still resident.  Skipping
the fork then leaves that buffer orphaned: the relation is dropped, but
the buffer remains in shared buffers and every later checkpoint fails
trying to write it back to the missing file ("could not open file ...
while writing block N of relation ..."), so the server can no longer
checkpoint.

Before the targeted-drop optimization (added in v14, commit bea449c635c)
the buffer pool was always scanned in full, so the buffer was
invalidated whether or not its file still existed, and dropping the
relation cleaned it up.

Restore that behavior for the main fork: when its size is uncached and
its file is missing, fall back to the full scan, which invalidates the
relation's buffers across all forks.  The main fork is the only fork a
relation with storage always has; the fsm, vm and init forks are
routinely absent on healthy relations (small tables have no fsm/vm;
permanent relations have no init fork), so triggering a full scan
whenever any fork is absent would disable the optimization for nearly
every drop.
---
 src/backend/storage/buffer/bufmgr.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index cc398db124d..1f1fece4a3a 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -4951,7 +4951,32 @@ DropRelationsAllBuffers(SMgrRelation *smgr_reln, int nlocators)
 			if (block[i][j] == InvalidBlockNumber)
 			{
 				if (!smgrexists(rels[i], j))
+				{
+					/*
+					 * In a disaster-recovery situation a relation's data file
+					 * may be missing on disk while a dirty buffer for the fork
+					 * is still resident.  Skipping the fork (because it has no
+					 * file) would leave that buffer orphaned, after which the
+					 * checkpointer fails on every run trying to write it to the
+					 * missing file, so the server can no longer checkpoint.
+					 * Fall back to the full buffer-pool scan, which invalidates
+					 * the relation's buffers across all forks regardless of the
+					 * missing file, as was done unconditionally before this
+					 * optimization, so dropping the relation can still clean it
+					 * up.  The main fork is the sentinel: it is the only fork a
+					 * relation with storage always has, whereas the fsm, vm and
+					 * init forks are routinely absent on healthy relations
+					 * (small tables have no fsm/vm; permanent relations have no
+					 * init fork), so triggering on their absence would force a
+					 * full scan on nearly every drop.
+					 */
+					if (j == MAIN_FORKNUM)
+					{
+						cached = false;
+						break;
+					}
 					continue;
+				}
 				cached = false;
 				break;
 			}
-- 
2.52.0