From: | Joe Conway <mail(at)joeconway(dot)com> |
---|---|
To: | Priya V <mailme0216(at)gmail(dot)com>, Frits Hoogland <frits(dot)hoogland(at)gmail(dot)com> |
Cc: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet |
Date: | 2025-08-06 16:33:49 |
Message-ID: | f35cdf25-411f-4cbc-a877-85569cf78061@joeconway.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
(Both: please trim and reply inline on these lists as I have done;
Frits, please reply all not just to the list -- I never received your
reply to me)
On 8/6/25 11:51, Priya V wrote:
> *cat /proc/sys/vm/overcommit_ratio*
> 50
> $ *cat /proc/sys/vm/swappiness*
> 60
>
> *Workload*: Multi-tenant PostgreSQL
>
> *uname -r*
> 4.18.0-477.83.1.el8_8.x86_64
IMHO you should strongly consider getting on a more recent distro with a
newer kernel.
> *free -h*
> total used free shared buff/cache available
> Mem: 249Gi 4.3Gi 1.7Gi 22Gi 243Gi 221Gi
> Swap: 0B 0B 0B
As I said, do not disable swap. You don't need a huge amount, but maybe
16 GB or so would do it.
> if we set overcommit_memory = 2, what should we set the
> overcommit_ration value to ? Can you pls suggest ?
> Is there a rule of thumb to go with ?
There is no rule of thumb that I am aware of. Every workload is
different. Start with something like 80 and do your own testing to
refine that number.
> *Our goal is to not run into OOM issues, no memory wastage and also not
> starve kernel ? *
With overcommit_memory = 2, swap on (and reasonably sized), and
overcommit_ratio to something reasonable (certainly below 100), I think
you will have a difficult time getting an OOM kill even if you try
during testing. But you have to do your own testing for your workloads
-- there is no magic button here.
That is, unless you run postgres in a cgroup with memory.limit (cgroup
v1) or memory.max (cgroup v2) set. Note, running in containers with
memory limits set e.g. via Kubernetes will do that under the covers.
That is a completely different story.
> On Wed, Aug 6, 2025 at 3:47 AM Frits Hoogland <frits(dot)hoogland(at)gmail(dot)com
> <mailto:frits(dot)hoogland(at)gmail(dot)com>> wrote:
> Can you name any technical reason why not having swap for a database
> is an actual bad idea?
Did you read the blog I linked? Do your own experiments.
* Swap is what is used when anonymous memory must be reclaimed to allow
for an allocation of anonymous memory.
* The Linux kernel will aggressively use all available memory for file
buffers, pushing usage against the limits.
* Especially in the older 4 series kernels, file buffers often cannot be
reclaimed fast enough
* With no swap and a large-ish anonymous memory request, it is easy to
push over the limit to cause the OOM killer to strike.
* On the other hand, with swap enabled anon memory can be reclaimed
giving the kernel more time to deal with file buffer reclamation.
At least that is what I have observed.
HTH,
--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Frits Hoogland | 2025-08-06 21:14:34 | Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet |
Previous Message | Priya V | 2025-08-06 15:51:29 | Re: Safe vm.overcommit_ratio for Large Multi-Instance PostgreSQL Fleet |