Re: Don't keep closed WAL segment in page cache after replay

From: Andres Freund <andres(at)anarazel(dot)de>
To: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
Cc: Japin Li <japinli(at)hotmail(dot)com>, Hüseyin Demir <huseyin(dot)d3r(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Don't keep closed WAL segment in page cache after replay
Date: 2026-03-04 15:55:58
Message-ID: i5urmkvtxfxsee7ra5o4oiolih2qq6lvzgspzjjd4suz5kfyin@enmxp53urnjb
Lists: pgsql-hackers

Hi,

On 2026-03-04 08:38:24 +0100, Anthonin Bonnefoy wrote:
> From ad0a3cfe10bdd2cccc4274849c4a77898b06e13c Mon Sep 17 00:00:00 2001
> From: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
> Date: Wed, 2 Jul 2025 09:58:52 +0200
> Subject: Don't keep closed WAL segments in page cache after replay
>
> On a standby, the recovery process reads the WAL segments, applies
> changes and closes the segment. When closed, the segments will still be
> in page cache memory until they are evicted due to inactivity. The
> segments may be re-read if archive_mode is set to always, wal_summarizer
> is enabled or if the standby is used for replication and has an active
> walsender.
>
> The presence of replication slots is also a likely indicator that
> a walsender will be started and will need to read the WAL segments.
>
> Outside of those circumstances, the WAL segments won't be re-read and
> keeping them in the page cache generates unnecessary memory pressure.
> A POSIX_FADV_DONTNEED is sent before closing a replayed WAL segment to
> immediately free any cached pages.
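
A minimal sketch of the mechanism described in the quoted commit message, written in Python for brevity. It is not the patch's code. The names `close_and_drop_cache`, `replay_segment` and `will_be_reread` are hypothetical; the last one stands in for the conditions listed above (archive_mode set to always, WAL summarization, an active walsender, existing replication slots). On Linux the hint generally only drops clean pages, and platforms without `posix_fadvise` simply skip it.

```python
import os


def close_and_drop_cache(fd: int) -> bool:
    """Close fd after asking the kernel to drop the file's cached pages.

    Returns True if the POSIX_FADV_DONTNEED hint was issued successfully.
    The hint is advisory: the kernel is free to ignore it.
    """
    advised = False
    if hasattr(os, "posix_fadvise"):
        try:
            # offset=0, len=0 covers the whole file.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            advised = True
        except OSError:
            # Some filesystems do not support the hint; closing still works.
            pass
    os.close(fd)
    return advised


def replay_segment(path: str, will_be_reread: bool) -> int:
    """Read a segment file to the end, then close it.

    will_be_reread is a hypothetical stand-in for the checks in the commit
    message. When it is False, the cached pages are released before closing.
    Returns the number of bytes read.
    """
    fd = os.open(path, os.O_RDONLY)
    nbytes = 0
    while True:
        chunk = os.read(fd, 8192)
        if not chunk:
            break
        nbytes += len(chunk)  # a real replay would apply WAL records here
    if will_be_reread:
        os.close(fd)
    else:
        close_and_drop_cache(fd)
    return nbytes
```

Whether dropping the pages is a net win depends on the later readers discussed below (crash restart, two-phase commit, base backup), which this sketch does not model.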

I am quite sceptical that this is a good idea.

Have you actually measured benefits? I skimmed the thread and didn't see
anything. It's pretty cheap for the kernel to replace a clean page from the
page cache with different content.

If you [crash-]restart the replica this will make it way more expensive. If
you have two-phase commits where we need to read 2PC details from the WAL, this
will make it more expensive. If somebody takes a base backup, this ...

I think you'd have to have pretty convincing benchmarks showing that this is a
good idea before we should even remotely consider applying this.

Greetings,

Andres Freund
