From: | Joe Conway <mail(at)joeconway(dot)com> |
---|---|
To: | Priya V <mailme0216(at)gmail(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet |
Date: | 2025-08-05 18:52:25 |
Message-ID: | 3633df47-5693-4871-9e12-bfbb68dd3313@joeconway.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 8/5/25 13:01, Priya V wrote:
> *Environment:*
> *PostgreSQL Versions:* Mix of 13.13 and 15.12 (upgrades in progress
> to be at 15.12 currently both are actively in use)
PostgreSQL 13 end of life after November 13, 2025
> *OS / Kernel:* RHEL 7 & RHEL 8 variants, kernels in the 4.14–4.18 range
RHEL 7 has been EOL for quite a while now. Note that you have to watch
out for collation issues/corrupted indexes after OS upgrades due to
collations changing with newer glibc versions.
> *Swap:* Currently none
bad idea
> *Workload:* Highly mixed — OLTP-style internal apps with
> unpredictable query patterns and connection counts
>
> *Goal:* Uniform, safe memory settings across the fleet to avoid
> kernel or database instability
> We’re considering:
> *|vm.overcommit_memory = 2|* for strict accounting
yes
> Increasing |vm.overcommit_ratio| from 50 → 80 or 90 to better
> reflect actual PostgreSQL usage (e.g., |work_mem| reservations that
> aren’t fully used)
work_mem does not reserve memory -- it is a maximum that might be used
in memory for a particular operation
> *Our questions for those running large PostgreSQL fleets:*
> 1.
> What |overcommit_ratio| do you find safe for PostgreSQL without
> causing kernel memory crunches?
> 2.
> Do you prefer |overcommit_memory = 1| or |= 2| for production stability?
Use overcommit_memory = 2 for production stability
> 3.
> How much swap (if any) do you keep in large-memory servers where
> PostgreSQL is the primary workload? Is having swap configured a good
> idea or not ?
You don't necessary need a large amount of swap, but you definitely
should not disable it.
Some background on that:
https://chrisdown.name/2018/01/02/in-defence-of-swap.html
> 4.
> Any real-world cases where kernel accounting was too strict or too
> loose for PostgreSQL?
In my experience the biggest issues are when postgres is running in a
memory constrained cgroup. If you want to constrain memory with cgroups,
use cgroup v2 (not 1) and use memory.high to constrain it, not memory.max.
> 5. What settings to go with if we are not planning on using swap ?
IMHO do not disable swap on Linux, at least not on production, ever.
> We’d like to avoid both extremes:
> Too low a ratio → PostgreSQL backends failing allocations even with
> free RAM
Have you actually seen this or are you theorizing?
> Too high a ratio → OOM killer terminating PostgreSQL under load spikes
If overcommit_memory = 2, overcommit_ratio is reasonable (less than 100,
maybe 80 or so as you suggested), and swap is not disabled, and you are
not running in a memory constrained cgroup, I would be very surprised if
you will ever get hit by the OOM killer. And if you do, things are so
bad the database was probably dying anyway.
HTH,
--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Frits Hoogland | 2025-08-06 08:47:29 | Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet |
Previous Message | Priya V | 2025-08-05 17:01:19 | Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet |