| From: | Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com> |
|---|---|
| To: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Auto-tune shared_buffers to use available huge pages |
| Date: | 2026-01-23 15:36:51 |
| Message-ID: | CAO6_Xqq6w5hTY_W+gJWp29t15NRtNLSTzD6khDC=Xy2P0BWPTQ@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi,
In a typical environment, the instance's number of huge pages can be
adjusted to the value reported by shared_memory_size_in_huge_pages.
Postgres can then be started, and the requested shared memory fits in
the available huge pages.
A similar approach is harder to implement in environments like
Kubernetes. If I want to modify the huge pages on a pod, I need to:
- Modify the host's huge pages
- Restart the host's kubelet so it detects the new amount of huge pages
- Modify the pod's huge page request
Most of those steps are far from practical. An alternative would be to
have a fixed number of huge pages (like 25% of the node's memory), and
to adjust the configuration, like the amount of shared_buffers.
However, adjusting the configuration to fit in a fixed amount of
memory is tricky:
- shared_buffers is used to auto-tune multiple parameters so there's
no easy formula to get the correct amount. The only way I've found is
to basically increase shared_buffers until
shared_memory_size_in_huge_pages matches the desired amount of huge
pages
- changing other parameters like max_connections means shared_buffers
has to be adjusted again
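The manual search described above can be sketched as a binary search. This is only an illustration: `probe` is a hypothetical callable returning shared_memory_size_in_huge_pages for a given shared_buffers value (in practice it could wrap something like `postgres -C shared_memory_size_in_huge_pages -c shared_buffers=...` on a stopped instance), `toy_probe` is a made-up model, and the search assumes the reported value never decreases as shared_buffers grows.

```python
from typing import Callable


def largest_shared_buffers(probe: Callable[[int], int],
                           target_pages: int,
                           lo_mb: int = 1,
                           hi_mb: int = 1 << 20) -> int:
    """Return the largest shared_buffers (in MB) whose huge page need fits.

    probe(mb) is a hypothetical helper returning the value of
    shared_memory_size_in_huge_pages for shared_buffers = mb megabytes.
    Assumes probe is non-decreasing in mb.
    """
    if probe(lo_mb) > target_pages:
        raise ValueError("even the smallest shared_buffers does not fit")
    while lo_mb < hi_mb:
        mid = (lo_mb + hi_mb + 1) // 2   # round up so the loop terminates
        if probe(mid) <= target_pages:
            lo_mb = mid                  # mid fits, try larger values
        else:
            hi_mb = mid - 1              # mid is too large
    return lo_mb


# Toy model for illustration only: 2 MB huge pages, a fixed 150 MB of other
# shared memory, and 3% of overhead that scales with shared_buffers.
def toy_probe(mb: int) -> int:
    total_mb = 150 + mb + (mb * 3) // 100
    return (total_mb + 1) // 2           # round up to whole 2 MB pages


best = largest_shared_buffers(toy_probe, target_pages=4096)
```

Any change to a parameter that feeds the fixed part of the model (max_connections, for instance) shifts the result, which is why the search has to be rerun each time.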
To help with that, the attached patch provides a new option,
huge_pages_autotune_buffers, to automatically use leftover huge pages
as shared_buffers. This requires some changes in the auto-tune logic:
- Subsystems that are using shared_buffers for auto-tuning will rely
on the configured shared_buffers, not the auto-tuned shared_buffers,
and they should save the auto-tuned value in a GUC. This will be done
in dedicated auto-tune functions.
- Once the auto-tune functions are called, modifying NBuffers won't
change the requested memory except for the shared buffer pool in
BufferManagerShmemSize
- We can get the leftover memory (free huge pages - requested memory),
and estimate how much shared_buffers we can add
- Increasing shared_buffers will also increase the freelist hashmap,
so the auto-tuned shared_buffers needs to be reduced
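The last two points amount to the following arithmetic. This is a rough sketch, not the code from the patch: the function name, the `strategy_size` callable standing in for StrategyShmemSize, the per-buffer cost, and the toy numbers are all assumptions. Only the "free huge pages - requested memory" idea and the default 8 kB block size come from the description above and from PostgreSQL itself.

```python
from typing import Callable

BLCKSZ = 8192  # default PostgreSQL block size in bytes


def autotune_extra_buffers(free_huge_pages: int,
                           huge_page_size: int,
                           requested_bytes: int,
                           nbuffers: int,
                           strategy_size: Callable[[int], int],
                           per_buffer_bytes: int = BLCKSZ) -> int:
    """Estimate how many buffers can be added to fill the leftover huge pages.

    strategy_size(n) is a stand-in for StrategyShmemSize(n); per_buffer_bytes
    is the assumed cost of one extra buffer in the shared buffer pool.
    The result is conservative: it always fits, but may not be the maximum.
    """
    leftover = free_huge_pages * huge_page_size - requested_bytes
    if leftover <= 0:
        return 0
    # First guess: spend all the leftover on the buffer pool.
    extra = leftover // per_buffer_bytes
    # The lookup structures grow with the buffer count, so shrink the guess
    # until the pool growth plus the structure growth fits in the leftover.
    while extra > 0:
        overhead = strategy_size(nbuffers + extra) - strategy_size(nbuffers)
        needed = extra * per_buffer_bytes + overhead
        if needed <= leftover:
            break
        # Drop at least enough buffers to cover the overshoot (ceil division).
        extra -= max(1, -(-(needed - leftover) // per_buffer_bytes))
    return max(extra, 0)


# Toy numbers: 1024 free 2 MB huge pages, 1.5 GB already requested, and a
# made-up strategy cost of 64 bytes per buffer.
extra = autotune_extra_buffers(
    free_huge_pages=1024,
    huge_page_size=2 * 1024 * 1024,
    requested_bytes=1536 * 1024 * 1024,
    nbuffers=16384,
    strategy_size=lambda n: 64 * n,
)
```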
The patch is split into the following sub-patches:
0001: Extract the current auto-tune logic into dedicated functions,
making the behaviour more consistent across subsystems.
0002: The checkpointer auto-tunes the request size using NBuffers, but
doesn't save the result in a GUC. This adds a new
checkpoint_request_size GUC with the same auto-tune logic.
0003: Extract HugePages_Free value when /proc/meminfo is parsed in
GetHugePageSize.
0004: Pass NBuffers as a parameter to StrategyShmemSize. This is
necessary to get how much memory will be used by the freelist using
'StrategyShmemSize(candidate_nbuffers) - StrategyShmemSize(NBuffers)'.
0005: Add BufferManagerAutotune to auto-tune the amount of shared_buffers.
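To illustrate what 0003 extracts, here is a small sketch of reading the huge page size and HugePages_Free from /proc/meminfo-style text. It is not the C code from the patch: the function name and the sample text are assumptions, and only the two field names come from the standard Linux meminfo format.

```python
import re
from typing import Optional, Tuple


def parse_hugepage_info(meminfo: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (huge page size in bytes, free huge pages) from meminfo text.

    Either element is None when the corresponding line is missing.
    """
    size = re.search(r"^Hugepagesize:\s+(\d+)\s+kB[ \t]*$", meminfo, re.MULTILINE)
    free = re.search(r"^HugePages_Free:\s+(\d+)[ \t]*$", meminfo, re.MULTILINE)
    size_bytes = int(size.group(1)) * 1024 if size else None
    free_pages = int(free.group(1)) if free else None
    return size_bytes, free_pages


# Made-up sample in the usual /proc/meminfo layout.
SAMPLE = """\
MemTotal:       65843072 kB
HugePages_Total:    8192
HugePages_Free:     8000
HugePages_Rsvd:        0
Hugepagesize:       2048 kB
"""

size_bytes, free_pages = parse_hugepage_info(SAMPLE)
```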
Regards,
Anthonin Bonnefoy
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0003-Extract-HugePages_Free-value-in-GetHugePageSize.patch | application/octet-stream | 5.0 KB |
| v1-0004-Pass-NBuffers-as-parameter-to-StrategyShmemSize.patch | application/octet-stream | 2.6 KB |
| v1-0005-Auto-tune-shared_buffers-to-use-available-huge-pa.patch | application/octet-stream | 5.7 KB |
| v1-0002-Add-GUC-for-checkpointer-request-queue-size.patch | application/octet-stream | 6.4 KB |
| v1-0001-Create-dedicated-shmem-Autotune-functions.patch | application/octet-stream | 17.9 KB |