Re: Two issues leading to discrepancies in FSM data on the standby server

From: Alexey Makhmutov <a(dot)makhmutov(at)postgrespro(dot)ru>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Two issues leading to discrepancies in FSM data on the standby server
Date: 2026-04-06 14:26:01
Message-ID: 60b79f39-69e5-4c73-a708-6ef1fd5e7980@postgrespro.ru
Lists: pgsql-hackers

Hi Andrey!

Thank you for your attention to this patch!

> Originally in e981653 was used MarkBufferDirty() but 96ef3b8 flipped to MarkBufferDirtyHint().
> Neither of these commits provided a comment on why this version was chosen. I think if we fix it we must comment things.

I think the reason for the change in 96ef3b8 (replacing 'MarkBufferDirty'
with 'MarkBufferDirtyHint') is described in the next commit (9df56f6),
in its README update:
> New WAL records cannot be written during recovery, so hint bits set
> during recovery must not dirty the page if the buffer is not already
> dirty, when checksums are enabled. Systems in Hot-Standby mode may
> benefit from hint bits being set, but with checksums enabled, a page
> cannot be dirtied after setting a hint bit (due to the torn page risk).
> So, it must wait for full-page images containing the hint bit updates to
> arrive from the master.

So it seems logical that any change to data not protected by WAL (which
includes the VM and FSM as well) should use MarkBufferDirtyHint, which,
with checksums enabled, does not dirty a clean buffer during recovery.
However, since FSM blocks can simply be zeroed in case of a checksum
mismatch, I think it's perfectly fine to use regular MarkBufferDirty here.
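
To illustrate the difference between the two calls, here is a minimal toy
model in C. It is only a sketch of the rule quoted from the README above,
not the actual bufmgr code: the ToyBuffer and ToyServer types and the
toy_* functions are hypothetical names made up for this example, and the
real MarkBufferDirtyHint checks more conditions than are modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not PostgreSQL's real structures. */
typedef struct ToyBuffer
{
    bool dirty;              /* would the buffer be written out later? */
} ToyBuffer;

typedef struct ToyServer
{
    bool in_recovery;        /* standby or crash recovery in progress */
    bool checksums_enabled;  /* data checksums turned on */
} ToyServer;

/* Models MarkBufferDirty(): the buffer always becomes dirty. */
void
toy_mark_dirty(ToyBuffer *buf)
{
    assert(buf != NULL);
    buf->dirty = true;
}

/*
 * Models the README rule for MarkBufferDirtyHint(): with checksums
 * enabled, a clean buffer is not dirtied during recovery, so the
 * in-memory change may never reach disk.
 */
void
toy_mark_dirty_hint(const ToyServer *srv, ToyBuffer *buf)
{
    assert(srv != NULL && buf != NULL);

    if (buf->dirty)
        return;              /* already dirty, nothing to decide */

    if (srv->checksums_enabled && srv->in_recovery)
        return;              /* skip: cannot WAL-log a full-page image */

    buf->dirty = true;
}
```

In this model, calling toy_mark_dirty_hint() on a clean buffer with both
in_recovery and checksums_enabled set leaves the buffer clean, which
corresponds to the FSM update being lost on the standby, while
toy_mark_dirty() marks it dirty regardless of server state.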

I've updated the first patch by adding a comment that explains the
reason for using MarkBufferDirty instead of MarkBufferDirtyHint here.

As for the second issue and its patch: it seems to be resolved in
current master by a881cc9, which removed the entire 'heap_xlog_visible'
function, as all-visibility information is now sent with the
XLOG_HEAP2_PRUNE_VACUUM_CLEANUP record, whose handler already uses
PageGetHeapFreeSpace. The problem is still relevant for pre-19
versions, so I will probably move it to a separate thread on the bugs list.

Thanks,
Alexey

Attachment: 0001-Mark-modified-FSM-buffer-as-dirty-during-recovery.patch (text/x-patch, 4.0 KB)
