From: | Priya V <mailme0216(at)gmail(dot)com> |
---|---|
To: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet |
Date: | 2025-08-05 17:01:19 |
Message-ID: | CAFsZ43xFxjSiONwRccXBQXZrPRd+Lh7XAkSVEG1ai165xPcoDA@mail.gmail.com |
Lists: | pgsql-performance |
Hello Postgres community,
We operate a large PostgreSQL fleet (~15,000 databases) on dedicated Linux
hosts.
Each host runs *multiple PostgreSQL instances* (multi-instance setup, not
just multiple DBs inside one instance).
*Environment:*
- *PostgreSQL Versions:* Mix of 13.13 and 15.12 (upgrades to 15.12 are in
  progress; both versions are currently in active use)
- *OS / Kernel:* RHEL 7 & RHEL 8 variants, kernels in the 4.14–4.18 range
- *RAM:* 256 GiB (varies slightly)
- *Swap:* Currently none
- *Workload:* Highly mixed: OLTP-style internal apps with unpredictable
  query patterns and connection counts
- *Goal:* Uniform, safe memory settings across the fleet to avoid kernel
  or database instability
We’re reviewing vm.overcommit_* settings because we’ve seen conflicting
guidance:
- vm.overcommit_memory = 2 gives predictability but can reject allocations
  early
- vm.overcommit_memory = 1 is more flexible but risks OOM kills if many
  backends hit peak memory usage at once
We’re considering:
- *vm.overcommit_memory = 2* for strict accounting
- Increasing vm.overcommit_ratio from 50 → 80 or 90 to better reflect
  actual PostgreSQL usage (e.g., work_mem reservations that aren’t fully
  used)
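To make the ratios concrete: as documented for the Linux kernel, with
vm.overcommit_memory = 2 the commit limit is (RAM minus huge pages) times
overcommit_ratio / 100, plus swap. The sketch below only does that
arithmetic for a 256 GiB host with no swap. The function name and the
zero-huge-pages default are illustrative assumptions, not measurements from
our hosts.

```python
GIB = 1024 ** 3

def commit_limit_bytes(ram_bytes, swap_bytes, overcommit_ratio, hugepage_bytes=0):
    """Approximate CommitLimit (bytes) under vm.overcommit_memory = 2.

    Kernel formula: (RAM - huge pages) * overcommit_ratio / 100 + swap.
    hugepage_bytes defaults to 0, which is an assumption for this sketch.
    """
    return (ram_bytes - hugepage_bytes) * overcommit_ratio // 100 + swap_bytes

# 256 GiB of RAM, no swap, at the default ratio and the two candidates.
for ratio in (50, 80, 90):
    limit = commit_limit_bytes(256 * GIB, 0, ratio)
    print(f"overcommit_ratio={ratio}: CommitLimit = {limit / GIB:.1f} GiB")
```

With no swap, the ratio alone decides how much of RAM can ever be
committed, and that limit is shared by every instance on the host.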
*Our questions for those running large PostgreSQL fleets:*
1. What overcommit_ratio do you find safe for PostgreSQL without causing
   kernel memory crunches?
2. Do you prefer overcommit_memory = 1 or = 2 for production stability?
3. How much swap (if any) do you keep on large-memory servers where
   PostgreSQL is the primary workload? Is configuring swap a good idea
   at all?
4. Any real-world cases where kernel accounting was too strict or too loose
   for PostgreSQL?
5. Which settings would you recommend if we do not plan to use swap?
We’d like to avoid both extremes:
- Too low a ratio → PostgreSQL backends failing allocations even with free
  RAM
- Too high a ratio → OOM killer terminating PostgreSQL under load spikes
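One way to see how close a host sits to the first extreme is to compare the
CommitLimit and Committed_AS fields in /proc/meminfo. The helper below is a
minimal sketch of that check; the function names are hypothetical, and it
assumes the usual "Name: value kB" layout of /proc/meminfo.

```python
def commit_headroom_kib(meminfo_text):
    """Return CommitLimit - Committed_AS in KiB from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name: <integer> [unit]" lines; skip anything else.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields["CommitLimit"] - fields["Committed_AS"]

def read_commit_headroom_kib(path="/proc/meminfo"):
    """Read the file at `path` (Linux only) and compute the headroom."""
    with open(path) as f:
        return commit_headroom_kib(f.read())

# Example on a Linux host:
#   print(read_commit_headroom_kib() / 1024 ** 2, "GiB of commit headroom")
```

A small or negative result under vm.overcommit_memory = 2 would suggest
that new allocations are close to being refused, regardless of how much RAM
looks free.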
Any operational experiences, tuning recommendations, or kernel/PG
interaction pitfalls would be very helpful.
TIA