Re: BUG #16160: Minor memory leak in case of starting postgres server with SSL encryption

From: Jelte Fennema <postgres(at)jeltef(dot)nl>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, duspensky(at)ya(dot)ru, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16160: Minor memory leak in case of starting postgres server with SSL encryption
Date: 2021-03-16 18:56:09
Message-ID: CAGECzQTEsABt5nMGff5mMakN-Nbex-gFhw4zvd5Ng3CgyPySyQ@mail.gmail.com
Lists: pgsql-bugs

>
> OOM errors should be gracefully handled and PG should continue to
> function. Was that not the case..?
>

Yes, they were gracefully handled. I also didn't mean to suggest that
disabling overcommit would be the right solution. We definitely don't want
to do that. I mainly added that to the email to make clear why a few MBs
expanded to effectively a few GBs.
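
To make that multiplication concrete, here is a minimal sketch of the arithmetic. The leak size and backend count are hypothetical placeholders, not the figures from our production system; the only thing taken from the discussion is that the leaked copy-on-write memory is counted once per backend towards committed_as.

```python
MIB = 1024 * 1024


def committed_growth_bytes(leak_bytes_per_backend: int, backends: int) -> int:
    """Commit charge attributable to the leak, summed over all backends.

    Each forked backend inherits the leaked copy-on-write region, and that
    region is counted towards committed_as once per process.
    """
    return leak_bytes_per_backend * backends


if __name__ == "__main__":
    leak = 4 * MIB    # hypothetical: leaked memory inherited by each backend
    backends = 500    # hypothetical: number of connected backends
    total = committed_growth_bytes(leak, backends)
    print(f"{total / MIB:.0f} MiB counted towards committed_as")
```

With these placeholder numbers a 4 MiB leak is charged as 2000 MiB, even though "used" memory grows by only about 4 MiB.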

On Tue, 16 Mar 2021 at 19:44, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> Greetings,
>
> * Jelte Fennema (postgres(at)jeltef(dot)nl) wrote:
> > We ran into this memory leak on PG11 in production. The leak was determined
> > to be the root cause of OOM errors we were seeing. There was a combination
> > of things that caused this leak to become serious enough for these OOM
> > errors to happen:
>
> OOM errors should be gracefully handled and PG should continue to
> function. Was that not the case..?
>
> > To clarify the context a bit more if you're not familiar with the details
> > of vm.overcommit_memory: There's "used" memory and "committed_as" memory.
> > The copy-on-write memory in all backends is counted towards "committed_as"
> > memory. "used" memory does not increase for every backend, because it's
> > copy-on-write and none of the backends write to this memory (since it's
> > leaked so there's no live pointer to it).
>
> Right- and is also why it's certainly important to be monitoring the
> committed_as value vs the commit limit.
>
> > Linux puts a hard limit on committed_as, because we use
> > vm.overcommit_memory=2 (which means memory overcommitting is disabled). If
> > we had memory overcommitting enabled, then this memory leak wouldn't be a
> > real problem. The amount of "used" memory is pretty much negligible. It
> > only becomes a problem because its committed_as is multiplied for every
> > process and we care about committed_as because of disabled overcommitting.
>
> Allowing overcommit, on the other hand, ends up with the Linux OOM
> Killer running and sending essentially a kill -9 to PG, causing the
> entire PG instance to crash and have to go through recovery.
>
> > It would be great if this could be backpatched to all currently supported
> > PG versions. The patch is very small, so it should be very little effort I
> > think. I'd be happy to help with that if that's useful or needed.
>
> +1 on back-patching these fixes. -1 on what came across, to me at
> least, as an argument for allowing overcommit. I realize you didn't
> explicitly say that, but figured it'd be good for the archives to
> discuss a bit more about why having overcommit_memory set to 2 is
> strongly recommended. Without that, runaway queries could lead to the
> OOM Killer running and the entire PG instance crashing.
>
> Thanks,
>
> Stephen
>
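
For the archives, the monitoring of committed_as against the commit limit mentioned above can be sketched roughly as follows. This assumes a Linux host where /proc/meminfo exposes the `CommitLimit` and `Committed_AS` fields (values in kB); the function names are made up for illustration and are not part of PostgreSQL or any existing tool.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: numeric value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def commit_headroom(meminfo: dict[str, int]) -> tuple[int, float]:
    """Return (kB left before the commit limit, fraction of the limit in use)."""
    limit = meminfo["CommitLimit"]
    committed = meminfo["Committed_AS"]
    return limit - committed, committed / limit


def read_meminfo(path: str = "/proc/meminfo") -> dict[str, int]:
    """Read and parse the kernel's memory accounting file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling `commit_headroom(read_meminfo())` periodically and alerting when the fraction approaches 1.0 is one way to notice committed_as creeping towards the limit before allocations start failing. The threshold to alert on is left to the operator.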
