Re: BUG #16160: Minor memory leak in case of starting postgres server with SSL encryption

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Jelte Fennema <postgres(at)jeltef(dot)nl>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, duspensky(at)ya(dot)ru, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16160: Minor memory leak in case of starting postgres server with SSL encryption
Date: 2021-03-16 18:44:55
Message-ID: 20210316184455.GB20766@tamriel.snowman.net
Lists: pgsql-bugs

Greetings,

* Jelte Fennema (postgres(at)jeltef(dot)nl) wrote:
> We ran into this memory leak on PG11 in production. The leak was determined
> to be the root cause of OOM errors we were seeing. There was a combination
> of things that caused this leak to become serious enough for these OOM
> errors to happen:

OOM errors should be gracefully handled and PG should continue to
function. Was that not the case..?

> To clarify the context a bit more if you're not familiar with the details
> of vm.overcommit_memory: There's "used" memory and "committed_as" memory.
> The copy-on-write memory in all backends is counted towards "committed_as"
> memory. "used" memory does not increase for every backend, because it's
> copy-on-write and none of the backends write to this memory (since it's
> leaked, there's no live pointer to it).

Right, and that is also why it's certainly important to be monitoring the
committed_as value vs the commit limit.
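
A minimal sketch of that kind of monitoring, assuming a Linux host where
/proc/meminfo exposes the Committed_AS and CommitLimit fields. The function
names and the idea of reporting a single ratio are illustrative choices,
not something taken from this thread:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def commit_usage(fields):
    """Return Committed_AS as a fraction of CommitLimit."""
    return fields["Committed_AS"] / fields["CommitLimit"]


def read_commit_usage(path="/proc/meminfo"):
    """Read the live ratio from the given meminfo file (Linux only)."""
    with open(path) as f:
        return commit_usage(parse_meminfo(f.read()))
```

A ratio approaching 1.0 means new allocations are close to being refused
when vm.overcommit_memory=2; the threshold at which to alert is left to the
operator.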

> Linux puts a hard limit on committed_as, because we use
> vm.overcommit_memory=2 (which means memory overcommitting is disabled). If
> we had memory overcommitting enabled, then this memory leak wouldn't be a
> real problem. The amount of "used" memory is pretty much negligible. It
> only becomes a problem because its committed_as is multiplied for every
> process, and we care about committed_as because overcommitting is disabled.

Allowing overcommit, on the other hand, ends up with the Linux OOM
Killer running and sending essentially a kill -9 to PG, causing the
entire PG instance to crash and have to go through recovery.

> It would be great if this could be backpatched to all currently supported
> PG versions. The patch is very small, so it should be very little effort I
> think. I'd be happy to help with that if that's useful or needed.

+1 on back-patching these fixes. -1 on what came across, to me at
least, as an argument for allowing overcommit. I realize you didn't
explicitly say that, but figured it'd be good for the archives to
discuss a bit more about why having overcommit_memory set to 2 is
strongly recommended. Without that, runaway queries could lead to the
OOM Killer running and the entire PG instance crashing.
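
For anyone checking their own hosts, a small sketch of verifying that
setting, assuming Linux and the standard /proc/sys/vm/overcommit_memory
file. The helper names are hypothetical, and the mode descriptions are
paraphrased from the kernel's documented values 0, 1 and 2:

```python
# Documented values of vm.overcommit_memory on Linux.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit",
    1: "always overcommit",
    2: "strict accounting (overcommit disabled)",
}


def parse_overcommit_mode(raw):
    """Convert the raw file contents (e.g. "2\n") into a validated mode number."""
    mode = int(raw.strip())
    if mode not in OVERCOMMIT_MODES:
        raise ValueError(f"unexpected vm.overcommit_memory value: {mode}")
    return mode


def overcommit_disabled(raw):
    """True only for mode 2, the setting recommended in this message."""
    return parse_overcommit_mode(raw) == 2


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Read the current mode from the running kernel (Linux only)."""
    with open(path) as f:
        return parse_overcommit_mode(f.read())
```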

Thanks,

Stephen
