Re: Memory Accounting v11

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Memory Accounting v11
Date: 2015-06-14 22:28:38
Message-ID: 557E0016.8050300@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 06/14/15 21:43, Jeff Davis wrote:
> This patch tracks memory usage (at the block level) for all memory
> contexts. Individual palloc()s aren't tracked; only the new blocks
> allocated to the memory context with malloc().

I see it's adding the new field as int64 - wouldn't a Size be more
appropriate, considering that's what we use in mctx.h and aset.c?

> It also adds a new function, MemoryContextMemAllocated() which can
> either retrieve the total allocated for the context, or it can
> recurse and sum up the allocations for all subcontexts as well.
>
> This is intended to be used by HashAgg in an upcoming patch that will
> bound the memory and spill to disk.
>
> Previous discussion here:
>
> http://www.postgresql.org/message-id/1407012053.15301.53.camel@jeff-desktop
>
> Previous concerns:
>
> * There was a slowdown reported of around 1-3% (depending on the exact
> version of the patch) on an IBM power machine when doing an index
> rebuild. The results were fairly noisy for me, but it seemed to hold up.
> See http://www.postgresql.org/message-id/CA
> +Tgmobnu7XEn1gRdXnFo37P79bF=qLt46=37ajP3Cro9dBRaA(at)mail(dot)gmail(dot)com
> * Adding a struct field to MemoryContextData may be bad for the CPU
> caching behavior, and may be the cause of the above slowdown.

> * Previous versions of the patch updated the parent contexts'
> allocations as well, which avoided the need to recurse when querying
> the total allocation. That seemed to penalize cases that didn't need
> to track the allocation. We discussed trying to "opt-in" to this
> behavior, but it seemed more awkward than helpful. Now, the patch
> only updates the allocation of the current context, and querying
> means recursing through the child contexts.

I don't think the opt-in idea itself was awkward, it was rather about
the particular APIs that we came up with, especially when combined with
the 'context inheritance' thing.

I still think the opt-in approach and updating accounting for the parent
contexts was the best one, because it (a) minimizes impact in cases that
don't use the accounting, and (b) makes finding the current amount of
memory cheap ...

> * There was a concern that, if MemoryContextMemAllocated needs to
> recurse to the child contexts, it will be too slow for HashAgg of
> array_agg, which creates a child context per group. That was solved with
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b419865a814abbca12bdd6eef6a3d5ed67f432e1

I wouldn't say this was "solved" - we have fixed one particular example
of such aggregate implementation, because it was causing OOM issues with
many groups, but there may be other custom aggregates using the same
pattern.

Granted, built-in aggregates are probably more critical than aggregates
provided by extensions, but I wouldn't dare to mark this solved ...

> My general answer to the performance concerns is that they aren't a
> reason to block this patch, unless someone has a suggestion about how
> to fix them. Adding one field to a struct and a few arithmetic
> operations for each malloc() or realloc() seems reasonable to me.

I'm not buying this, sorry. While I agree that we should not expect the
memory accounting to be entirely free, we should be very careful about
the overhead especially if we're dropping the opt-in and thus imposing
the overhead on everyone.

But "performance concerns are not a reason to block this patch" approach
seems wrong. With any other patch a 3% regression would be considered a
serious issue IMNSHO.

> The current state, where HashAgg just blows up the memory, is just
> not reasonable, and we need to track the memory to fix that problem.
> Others have also mentioned that we might want to use this mechanism
> to track memory for other operators, like Sort or HashJoin, which
> might be simpler and more accurate.

Dropping the memory accounting implementations and keeping just this new
solution would be nice, only if we agree the performance impact to be
acceptable. We already have accounting solution for each of those
places, so I don't think the unification alone outweighs the regression.

regards

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Nolan 2015-06-14 22:35:01 Re: On columnar storage
Previous Message Jan de Visser 2015-06-14 22:16:10 Re: Git humor