Re: Limiting memory allocation

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Oleksii Kliukin <alexk(at)hintbits(dot)com>
Cc: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Jan Wieck <jan(at)wi3ck(dot)info>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Limiting memory allocation
Date: 2022-05-20 19:50:52
Message-ID: 20220520195051.GK9030@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Oleksii Kliukin (alexk(at)hintbits(dot)com) wrote:
> > On 18. May 2022, at 17:11, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > On 2022-May-18, Jan Wieck wrote:
> >> Maybe I'm missing something, but what is it that you would actually consider
> >> a solution? Knowing your current memory consumption doesn't make the need
> >> for allocating some right now go away. What do you envision the response of
> >> PostgreSQL to be if we had that information about resource pressure?
> >
> > What they (Timescale) do, is have a LD_PRELOAD library that checks
> > status of memory pressure, and return NULL from malloc(). This then
> > leads to clean abort of transactions and all is well. There's nothing
> > that Postgres needs to do different than today.

I'm not super fixated on exactly what this one implementation does, but
rather on the fact that the kernel is evidently not interested in trying
to solve this problem, and therefore it's something we need to address
ourselves. I agree in general that we don't need to do much differently,
except to have a configurable limit at which we treat an allocation
attempt as failing; the rest of our existing machinery will then handle
failing the transaction and doing cleanup just fine.

> Correct. The library we have reads a limit supplied in an environment variable
> and stores per-process and total memory usage values in shared memory counters,
> updated after each call to malloc/free/calloc/realloc by the process making the
> call. When updating totals, a process picks one of N counters to update
> atomically with the difference between its old and new memory usage, avoiding
> congested ones; those are summed to determine current allocations for all
> processes and to compare against the limit.

It would be interesting to know just how many of these counters are used
and how 'congested' ones are avoided, though it would certainly be
easier if one could simply review the library itself.

> > I suppose that what they would like, is a way to inquire into the memory
> > pressure status at MemoryContextAlloc() time and return NULL if it is
> > too high.

I'm not really concerned with what one specific existing implementation
would like, but rather with solving the larger issue: we aren't able to
cap our memory usage today, and that can lead to the OOM killer coming
into play, excessive swap usage, or problems for other processes running
on the same system. While I started this with the crash case as the main
concern, and I still feel it's a big case to consider, there are other
valuable use-cases where this would help.

> If we call user code just before malloc (and, presumably free and realloc), the
> code would have to do just as much work as when it is called from the
> malloc/free/realloc wrappers inside a preloaded library. Furthermore, I don’t
> see why the user would want to customize that logic: a single Linux-specific
> implementation would solve the problem for everyone.

If the problem is explicitly defined as "deal with the Linux OOM killer"
then, yes, a Linux-specific fix would address that. I do think that's
certainly an important, and perhaps the most important, issue that this
solves, but there are other cases where this would be really helpful.

> > How exactly this would work is unclear to me; maybe one
> > process keeps an eye on it in an OS-specific manner,

There seems to be a lot of focus on trying to implement this as "get the
amount of free memory from the OS and make sure we don't go over that
limit" and that adds a lot of OS-specific logic which complicates things
and also ignores the use-cases where an admin wishes to limit PG's
memory usage due to other processes running on the same system. I'll
point out that the LD_PRELOAD library doesn't even attempt to do this,
even though it's explicitly for Linux, but uses an environment variable
instead.

In PG, we'd have a GUC that an admin is able to set; we would then track
the memory usage (perhaps per-process, perhaps using some set of
buckets, perhaps locally per-process and then in a smaller number of
buckets in shared memory, or something else) and fail an allocation when
it would go over that limit, perhaps only when it's a regular user
backend, or with other conditions around it.

> What would be useful is a way for Postgres to count the amount of memory
> allocated by each backend. This could be advantageous for giving per-backend
> memory usage to the user, as well as for enforcing a limit on the total amount
> of memory allocated by the backends.

I agree that this would be independently useful.

Thanks,

Stephen
