Auto-tune shared_buffers to use available huge pages

From: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Auto-tune shared_buffers to use available huge pages
Date: 2026-01-23 15:36:51
Message-ID: CAO6_Xqq6w5hTY_W+gJWp29t15NRtNLSTzD6khDC=Xy2P0BWPTQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

In a conventional environment, the instance's number of huge pages can
be adjusted to the value reported by shared_memory_size_in_huge_pages.
Postgres can then be started, and the requested shared memory fits in
the available huge pages.

A similar approach is harder to implement in environments like
Kubernetes. If I want to modify the huge pages on a pod, I need to:
- Modify the host's huge pages
- Restart the host's kubelet so it detects the new amount of huge pages
- Modify the pod's huge page request

Most of those steps are far from practical. An alternative would be to
reserve a fixed number of huge pages (for example, 25% of the node's
memory) and adjust the configuration, such as shared_buffers, to fit.
However, adjusting the configuration to fit in a fixed amount of
memory is tricky:
- shared_buffers is used to auto-tune multiple parameters so there's
no easy formula to get the correct amount. The only way I've found is
to basically increase shared_buffers until
shared_memory_size_in_huge_pages matches the desired amount of huge
pages
- changing other parameters like max_connections means shared_buffers
has to be adjusted again
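
The trial-and-error search described above can be sketched as a binary
search. This is only an illustration under assumptions: toy_pages_for
and struct toy_model are hypothetical stand-ins for "the value that
shared_memory_size_in_huge_pages would report for a given
shared_buffers", and the linear cost model is not PostgreSQL's real
sizing. The search itself only requires that the reported page count
never decreases as the number of buffers grows.

```c
#include <assert.h>
#include <stdint.h>

/* Callback reporting how many huge pages a given nbuffers would need. */
typedef uint64_t (*shmem_pages_fn)(uint64_t nbuffers, void *arg);

/* Hypothetical toy model: fixed cost plus a flat per-buffer cost. */
struct toy_model
{
    uint64_t fixed_bytes;      /* memory that does not depend on nbuffers */
    uint64_t per_buffer_bytes; /* block size plus assumed overhead */
    uint64_t huge_page_bytes;  /* e.g. 2 MiB */
};

static uint64_t
toy_pages_for(uint64_t nbuffers, void *arg)
{
    const struct toy_model *m = arg;
    uint64_t bytes = m->fixed_bytes + nbuffers * m->per_buffer_bytes;

    /* Round up to whole huge pages. */
    return (bytes + m->huge_page_bytes - 1) / m->huge_page_bytes;
}

/*
 * Return the largest nbuffers in [lo, hi] whose page count does not
 * exceed target_pages, or 0 if even lo does not fit.
 */
static uint64_t
largest_fitting_nbuffers(shmem_pages_fn pages_for, void *arg,
                         uint64_t lo, uint64_t hi, uint64_t target_pages)
{
    assert(lo >= 1 && lo <= hi);

    if (pages_for(lo, arg) > target_pages)
        return 0;

    while (lo < hi)
    {
        /* Bias the midpoint upwards so the loop always makes progress. */
        uint64_t mid = lo + (hi - lo + 1) / 2;

        if (pages_for(mid, arg) <= target_pages)
            lo = mid;
        else
            hi = mid - 1;
    }
    return lo;
}
```

The result is only valid for one fixed set of other parameters: as
noted above, changing something like max_connections changes the
fixed part of the model, so the search has to be repeated.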

To help with that, the attached patch provides a new option,
huge_pages_autotune_buffers, to automatically use leftover huge pages
as shared_buffers. This requires some changes in the auto-tune logic:
- Subsystems that use shared_buffers for auto-tuning will rely on the
configured shared_buffers, not the auto-tuned shared_buffers, and they
should save their auto-tuned value in a GUC. This will be done in
dedicated auto-tune functions.
- Once the auto-tune functions are called, modifying NBuffers won't
change the requested memory except for the shared buffer pool in
BufferManagerShmemSize
- We can get the leftover memory (free huge pages - requested memory),
and estimate how much shared_buffers we can add
- Increasing shared_buffers will also increase the freelist hashmap,
so the auto-tuned shared_buffers needs to be reduced to leave room
for it
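
As a rough illustration of the last two points, the leftover
computation could look like the sketch below. It is not taken from the
patch: extra_buffers_for_leftover is a hypothetical name, and
per_buffer_overhead is an assumed flat per-buffer cost standing in for
whatever grows along with the buffer pool.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate how many buffers can be added on top of the configured
 * shared_buffers. All sizes are in bytes.
 *
 * requested_bytes is the shared memory requested with the configured
 * shared_buffers, after the other auto-tune functions have run.
 * per_buffer_overhead is an assumption made for this sketch.
 */
static uint64_t
extra_buffers_for_leftover(uint64_t free_huge_pages,
                           uint64_t huge_page_size,
                           uint64_t requested_bytes,
                           uint64_t block_size,
                           uint64_t per_buffer_overhead)
{
    uint64_t available = free_huge_pages * huge_page_size;
    uint64_t leftover;

    assert(block_size > 0);

    /* Nothing to hand out if the request already fills the free pages. */
    if (requested_bytes >= available)
        return 0;

    /* leftover = free huge pages - requested memory */
    leftover = available - requested_bytes;

    /* Each extra buffer costs its block plus the assumed overhead. */
    return leftover / (block_size + per_buffer_overhead);
}
```

Dividing by block_size alone would overestimate the result, which is
why the per-buffer overhead is folded into the divisor here.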

The patch is split in the following sub-patches:

0001: Extract the current auto-tune logic into dedicated functions,
making the behaviour more consistent across subsystems.

0002: The checkpointer auto-tunes the request size using NBuffers, but
doesn't save the result in a GUC. This adds a new
checkpoint_request_size GUC with the same auto-tune logic.

0003: Extract HugePages_Free value when /proc/meminfo is parsed in
GetHugePageSize.
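
For readers unfamiliar with the file format, a minimal standalone
sketch of reading HugePages_Free is shown below. It is not the code
from 0003: the function names are hypothetical, and the sketch only
assumes the usual "HugePages_Free:   <count>" line layout of
/proc/meminfo on Linux.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one line of /proc/meminfo. Returns 1 and stores the count in
 * *free_pages if the line is the HugePages_Free entry, 0 otherwise.
 */
static int
parse_hugepages_free_line(const char *line, unsigned long *free_pages)
{
    unsigned long value;

    assert(line != NULL && free_pages != NULL);

    /* The space in the format matches any run of whitespace. */
    if (sscanf(line, "HugePages_Free: %lu", &value) == 1)
    {
        *free_pages = value;
        return 1;
    }
    return 0;
}

/*
 * Scan an already opened meminfo stream line by line. Returns 1 if
 * the HugePages_Free entry was found, 0 otherwise.
 */
static int
read_hugepages_free(FILE *fp, unsigned long *free_pages)
{
    char line[256];

    while (fgets(line, sizeof(line), fp) != NULL)
    {
        if (parse_hugepages_free_line(line, free_pages))
            return 1;
    }
    return 0;
}
```

A caller would open /proc/meminfo with fopen, pass the stream to
read_hugepages_free, and treat a return value of 0 as "free huge page
count unknown".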

0004: Pass NBuffers as a parameter to StrategyShmemSize. This is
necessary to get how much memory will be used by the freelist using
'StrategyShmemSize(candidate_nbuffers) - StrategyShmemSize(NBuffers)'.
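
To show how such a size difference could be used, here is a hedged
sketch that shrinks a candidate until the extra buffers plus the size
delta fit in the leftover memory. toy_strategy_size is a hypothetical
stand-in for StrategyShmemSize with made-up constants, and
fit_candidate_nbuffers is not a function from the patch.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t (*size_fn)(uint64_t nbuffers);

/* Hypothetical stand-in for StrategyShmemSize(nbuffers). */
static uint64_t
toy_strategy_size(uint64_t nbuffers)
{
    return 4096 + nbuffers * 40;
}

/*
 * Start from the optimistic candidate (all leftover bytes spent on
 * buffer blocks) and shrink it until the blocks plus the growth of
 * strategy_size fit in leftover_bytes. The result may leave a few
 * buffers unused, but it never exceeds the leftover.
 */
static uint64_t
fit_candidate_nbuffers(uint64_t base_nbuffers, uint64_t leftover_bytes,
                       uint64_t block_size, size_fn strategy_size)
{
    uint64_t candidate;

    assert(block_size > 0);
    candidate = base_nbuffers + leftover_bytes / block_size;

    while (candidate > base_nbuffers)
    {
        uint64_t extra = candidate - base_nbuffers;
        uint64_t cost = extra * block_size +
            (strategy_size(candidate) - strategy_size(base_nbuffers));
        uint64_t over;
        uint64_t step;

        if (cost <= leftover_bytes)
            break;

        /* Drop enough buffers to cover the overshoot, at least one. */
        over = cost - leftover_bytes;
        step = (over + block_size - 1) / block_size;
        candidate -= (step < extra) ? step : extra;
    }
    return candidate;
}
```

With a size function that is linear in the number of buffers, as in
this toy model, the loop settles after a single correction.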

0005: Add BufferManagerAutotune to auto-tune the amount of shared_buffers.

Regards,
Anthonin Bonnefoy

Attachment                                                       Content-Type              Size
v1-0003-Extract-HugePages_Free-value-in-GetHugePageSize.patch    application/octet-stream  5.0 KB
v1-0004-Pass-NBuffers-as-parameter-to-StrategyShmemSize.patch    application/octet-stream  2.6 KB
v1-0005-Auto-tune-shared_buffers-to-use-available-huge-pa.patch  application/octet-stream  5.7 KB
v1-0002-Add-GUC-for-checkpointer-request-queue-size.patch        application/octet-stream  6.4 KB
v1-0001-Create-dedicated-shmem-Autotune-functions.patch          application/octet-stream  17.9 KB
