pgsql: Fix 'unexpected data beyond EOF' on replica restart

From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Fix 'unexpected data beyond EOF' on replica restart
Date: 2026-01-15 19:11:51
Message-ID: E1vgSl4-000fEI-2P@gemulon.postgresql.org
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-committers

Fix 'unexpected data beyond EOF' on replica restart

On restart, a replica can fail with an error like 'unexpected data
beyond EOF in block 200 of relation T/D/R'. These are the steps to
reproduce it:

- A relation has a size of 400 blocks.
- Blocks 201 to 400 are empty.
- Block 200 has two rows.
- Blocks 100 to 199 are empty.
- A restartpoint is done
- Vacuum truncates the relation to 200 blocks
- A FPW deletes a row in block 200
- A checkpoint is done
- A FPW deletes the last row in block 200
- Vacuum truncates the relation to 100 blocks
- The replica restarts

When the replica restarts:

- The relation on disk starts at 100 blocks, because all the
truncations were applied before restart.
- The first truncate to 200 blocks is replayed. It silently fails, but
it will still (incorrectly!) update the cache size to 200 blocks
- The first FPW on block 200 is applied. XLogReadBufferForRead relies
on the cached size and incorrectly assumes that the page already
exists in the file, and thus won't extend the relation.
- The online checkpoint record is replayed, calling smgrdestroyall
which causes the cached size to be discarded
- The second FPW on block 200 is applied. This time, the detected size
is 100 blocks, an extend is attempted. However, the block 200 is
already present in the buffer cache due to the first FPW. This
triggers the 'unexpected data beyond EOF'.

To fix, update the cached size in SmgrRelation with the current size
rather than the requested new size, when the requested new size is
greater.

Author: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
Discussion: https://www.postgresql.org/message-id/CAO6_Xqrv-snNJNhbj1KjQmWiWHX3nYGDgAc=vxaZP3qc4g1Siw@mail.gmail.com
Backpatch-through: 14

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/23b25586dccfe81b6ac46ba8b4e505d48192732b

Modified Files
--------------
src/backend/storage/smgr/md.c | 3 +++
src/backend/storage/smgr/smgr.c | 11 ++++++++++-
2 files changed, 13 insertions(+), 1 deletion(-)

Browse pgsql-committers by date

  From Date Subject
Next Message Tom Lane 2026-01-15 19:12:26 pgsql: Optimize LISTEN/NOTIFY via shared channel map and direct advance
Previous Message Álvaro Herrera 2026-01-15 18:20:45 pgsql: Remove #include <math.h> where not needed