From: Priya V <mailme0216(at)gmail(dot)com>
To: Frits Hoogland <frits(dot)hoogland(at)gmail(dot)com>
Cc: Joe Conway <mail(at)joeconway(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet
Date: 2025-08-06 15:51:29
Message-ID: CAFsZ43xF00OHNg6d8iD0zd0OCpWe8+r9p4A+HAupo0jxOdYgpQ@mail.gmail.com
Lists: pgsql-performance
Hi Frits, Joe,
Thank you both for your insights.
Current situation:

$ cat /proc/sys/vm/overcommit_memory
0
$ cat /proc/sys/vm/overcommit_ratio
50
$ cat /proc/sys/vm/swappiness
60

Workload: Multi-tenant PostgreSQL

$ uname -r
4.18.0-477.83.1.el8_8.x86_64

$ free -h
              total        used        free      shared  buff/cache   available
Mem:          249Gi       4.3Gi       1.7Gi        22Gi       243Gi       221Gi
Swap:            0B          0B          0B
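As a point of reference (my own sketch, assuming a Linux box with /proc/meminfo), the kernel already reports its commit accounting even under overcommit_memory=0, so we can see how close the current commit charge is to the limit that mode 2 would enforce:

```shell
# Sketch: show the kernel's commit accounting from /proc/meminfo.
# CommitLimit is what overcommit_memory=2 would enforce at allocation
# time; Committed_AS is the memory currently committed.
awk '/^CommitLimit:|^Committed_AS:/ {printf "%-14s %.1f GiB\n", $1, $2/1048576}' /proc/meminfo
```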
If we set overcommit_memory = 2, what should we set the overcommit_ratio
value to? Can you please suggest? Is there a rule of thumb to go with?
Our goal is to avoid OOM issues, avoid wasting memory, and also not
starve the kernel.
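To make the question concrete, here is a quick sketch of the arithmetic (my own, hugepage carve-outs ignored): under vm.overcommit_memory=2 the kernel's commit limit is roughly RAM * overcommit_ratio/100 + swap, so on the 249 GiB box above with no swap, a ratio of 80 would allow about 199 GiB of committed memory:

```shell
# Sketch: compute the approximate CommitLimit that overcommit_memory=2
# would enforce for a few candidate overcommit_ratio values.
# Formula (hugepages ignored): CommitLimit = RAM * ratio/100 + swap.
ram_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
swap_kb=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
for ratio in 50 80 90; do
  limit_gib=$(( (ram_kb * ratio / 100 + swap_kb) / 1048576 ))
  echo "overcommit_ratio=${ratio} -> CommitLimit ~ ${limit_gib} GiB"
done
```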
Thanks!
On Wed, Aug 6, 2025 at 3:47 AM Frits Hoogland <frits(dot)hoogland(at)gmail(dot)com>
wrote:
> Joe,
>
> Can you name any technical reason why not having swap for a database is an
> actual bad idea?
>
> Memory always is limited. Swap was invented to overcome a situation where
> the (incidental) usage of paged-in memory could (regularly) get
> higher than physical memory would allow, and thus provide the (clear)
> workaround of having swap to 'cushion' the memory shortage by
> allowing a "second level" of memory storage on disk.
> Still, this does not make memory unlimited. Swap extends the physical
> memory available by the amount of swap. There still is a situation where
> you can run out of memory when swap is added, simply by paging in more
> memory than physical memory and swap combined.
>
> Today, most systems are not memory constrained anymore, or rather: it is
> possible to get a server with enough physical memory to hold your typical
> total memory need.
> And given the latency sensitive nature of databases in general, which
> includes postgres, for any serious deployment you should get a server with
> enough memory to host your workload, and configure postgres not to overload
> the memory.
>
> If you do oversubscribe on (physical) memory, you will get pain somewhere,
> there is no way around that.
> The article in defense of swap in essence is saying that if you happen to
> oversubscribe on memory, sharing the pain between anonymous and file is
> better.
> I would say you are already in a bad place if that happens, which is
> especially bad for databases, and databases should allow you to make memory
> usage predictable.
>
> However, what I found is that with 4.x kernels (4.18 to be precise; rhel
> 8), the kernel can try to favour file pages in certain situations, causing
> anonymous memory to get paged out even if swappiness is set to 1 or 0 and
> there is a wealth of inactive file memory. It seems to have to do with
> workingset protection(?) mechanisms, but given the lack of clear statistics
> I can't be sure about that. What it does lead to in my situations is a
> constant rate of swapping in and out, whilst there is
> no technical reason for linux to do so because there is enough available
> memory.
>
> My point of view has been that vm.overcommit_memory set to 2 was the way
> to go, because it lets linux enforce a fixed limit at allocation
> time, which guarantees that linux itself never runs out of
> memory, absolutely.
> However, this limit is hard, and is applied to the process in both
> user mode and system mode (kernel level), and thus can refuse to provide
> memory at times when it's not safe to do so, and thus corrupt execution. I
> have to be honest, I have not seen this myself, but trustworthy sources
> have reported this repeatedly, which I am inclined to believe. This means
> postgres execution can corrupt/terminate in unlucky situations, which
> impacts availability.
>
>
> *Frits Hoogland*
>
>
>
>
> On 5 Aug 2025, at 20:52, Joe Conway <mail(at)joeconway(dot)com> wrote:
>
> On 8/5/25 13:01, Priya V wrote:
>
> Environment:
> PostgreSQL Versions: Mix of 13.13 and 15.12 (upgrades to 15.12 in
> progress; currently both are actively in use)
>
>
> Note that PostgreSQL 13 reaches end of life after November 13, 2025.
>
> OS / Kernel: RHEL 7 & RHEL 8 variants, kernels in the 4.14–4.18 range
>
>
> RHEL 7 has been EOL for quite a while now. Note that you have to watch out
> for collation issues/corrupted indexes after OS upgrades due to collations
> changing with newer glibc versions.
>
> Swap: Currently none
>
>
> bad idea
>
> Workload: Highly mixed — OLTP-style internal apps with
> unpredictable query patterns and connection counts
> Goal: Uniform, safe memory settings across the fleet to avoid
> kernel or database instability
>
>
> We’re considering:
> vm.overcommit_memory = 2 for strict accounting
>
>
> yes
>
> Increasing vm.overcommit_ratio from 50 → 80 or 90 to better
> reflect actual PostgreSQL usage (e.g., work_mem reservations that
> aren’t fully used)
>
>
> work_mem does not reserve memory -- it is a maximum that might be used in
> memory for a particular operation
>
> Our questions for those running large PostgreSQL fleets:
>
> 1. What overcommit_ratio do you find safe for PostgreSQL without
> causing kernel memory crunches?
>
>
> Read this:
>
> https://www.cybertec-postgresql.com/en/what-you-should-know-about-linux-memory-overcommit-in-postgresql/
>
> 2. Do you prefer overcommit_memory = 1 or = 2 for production stability?
>
>
> Use overcommit_memory = 2 for production stability
>
> 3. How much swap (if any) do you keep in large-memory servers where
> PostgreSQL is the primary workload? Is having swap configured a good
> idea or not?
>
>
> You don't necessarily need a large amount of swap, but you definitely should
> not disable it.
>
> Some background on that:
> https://chrisdown.name/2018/01/02/in-defence-of-swap.html
>
> 4. Any real-world cases where kernel accounting was too strict or too
> loose for PostgreSQL?
>
>
> In my experience the biggest issues are when postgres is running in a
> memory constrained cgroup. If you want to constrain memory with cgroups,
> use cgroup v2 (not 1) and use memory.high to constrain it, not memory.max.
>
> 5. What settings should we go with if we are not planning on using swap?
>
>
> IMHO do not disable swap on Linux, at least not on production, ever.
>
> We’d like to avoid both extremes:
> Too low a ratio → PostgreSQL backends failing allocations even with
> free RAM
>
>
> Have you actually seen this or are you theorizing?
>
> Too high a ratio → OOM killer terminating PostgreSQL under load spikes
>
>
> If overcommit_memory = 2, overcommit_ratio is reasonable (less than 100,
> maybe 80 or so as you suggested), and swap is not disabled, and you are not
> running in a memory constrained cgroup, I would be very surprised if you
> will ever get hit by the OOM killer. And if you do, things are so bad the
> database was probably dying anyway.
>
> HTH,
>
> --
> Joe Conway
> PostgreSQL Contributors Team
> Amazon Web Services: https://aws.amazon.com
>
>
>
>