Re: 9.5: Better memory accounting, towards memory-bounded HashAgg

From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: 9.5: Better memory accounting, towards memory-bounded HashAgg
Date: 2014-11-17 16:50:22
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 15.11.2014 22:36, Simon Riggs wrote:
> On 16 October 2014 02:26, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>> The inheritance is awkward anyway, though. If you create a tracked
>> context as a child of an already-tracked context, allocations in
>> the newer one won't count against the original. I don't see a way
>> around that without introducing even more performance problems.
> This seems to have reached impasse, which is a shame. Let me throw
> out a few questions to see if that might help.
> Do I understand correctly that we are trying to account for exact
> memory usage at palloc/pfree time? Why??
> Surely we just want to keep track of the big chunks? Does this need
> to be exact? How exact does it need to be?

What do you mean by "big chunks"? Blocks, or really over-sized chunks
(over 8kB)? I assume blocks, and that's what the proposed patches are doing.

> Or alternatively, can't we just sample the allocations to reduce
> the overhead?

That might work, but I'm afraid it's trickier than it might seem.

> Are there some assumptions we can make about micro-contexts never
> needing to be tracked at all?

What is a micro-context?

> Jeff seems not too bothered by inheritance, whereas Tomas thinks its
> essential. Perhaps there is a middle ground, depending upon the role
> of the context?

Having a hierarchy of memory contexts and an accounting completely
ignoring that hierarchy seems a bit strange to me.

Maybe we can impose some restrictions, for example:

(a) a single context with tracking explicitly requested (in the whole

(b) when tracking is requested for a context, the parent context must
not have tracking enabled

Either of these restrictions would prevent a situation where a context
has to update accounting for two parent contexts. That'd allow updating
a single place (because each context has a single parent context with
tracking requested).

But I'm not sure that those restrictions are acceptable, that we can
enforce them reliably, or how frequently that'd break external code
(e.g. extensions with custom aggregates etc.). Because this is part of
the public API, including the option to enable accounting for a memory

Imagine you write an extension with a custom aggregate (or whatever) and
it works just fine. Then we tweak something in the executor, requesting
accounting for another context, and the extension suddenly breaks.

Or maybe we don't change anything, and the aggregate works fine with
Group Aggregate, but breaks whenever the plan switches to Hash Aggregate
(because that's using accounting).

Or maybe the user just redirects the tracking somewhere else (e.g. by
supplying a pointer to the v6 of the patch), excluding the context from
the accounting entirely.

So yeah, I think the inheritance is an important aspect, at least for a
generic accounting infrastructure. I'd bet that if we choose to do that,
there'll be a good deal of various WTF moments and foot-shooting in the
near future ...

On 23/8 I proposed to use a separate hierarchy of accounting info,
parallel to the memory context hierarchy (but only for contexts with
tracking explicitly requested)

but no response so far. I do see ~1% regression on Robert's benchmark
(on x86), but it's quite noisy. Not sure about PPC, which is what Robert
used. I don't see how to make it faster without sacrificing full
inheritance support ...


In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2014-11-17 17:04:55 Re: 9.5: Better memory accounting, towards memory-bounded HashAgg
Previous Message Christoph Berg 2014-11-17 16:38:38 Re: [HACKERS] ltree::text not immutable?