Re: 9.5: Better memory accounting, towards memory-bounded HashAgg

From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: 9.5: Better memory accounting, towards memory-bounded HashAgg
Date: 2014-08-16 21:09:11
Message-ID: 53EFC877.6030103@fuzzy.cz
Lists: pgsql-hackers

On 16.8.2014 20:00, Tom Lane wrote:
> Tomas Vondra <tv(at)fuzzy(dot)cz> writes:
>> I believe this should check parent->track_mem, just like
>> update_allocation, because this way it walks all the parent context up
>> to the TopMemoryContext.
>
> TBH, I don't think this "opt-in tracking" business has been thought
> through even a little bit; for example, what happens if there is a
> declared-to-not-track context in between levels that are declared to
> track? And it certainly has not been proven that the extra complexity
> actually saves cycles rather than the reverse.

AFAIK there's no other built-in accounting mechanism at the moment.
That's one of the reasons why some of the nodes had to invent their own
custom accounting - hashjoin is an example of that.
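
For illustration, the kind of custom tracking I mean is essentially just
a counter bumped for every tuple added to the hash table and checked
against the limit - something like this sketch (hypothetical names, not
the actual nodeHash.c code):

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node accounting, similar in spirit to the counter a
 * hash join keeps for its hash table. Illustrative only. */
typedef struct HashTableAccounting
{
    size_t      spaceUsed;      /* bytes currently used by the hash table */
    size_t      spaceAllowed;   /* memory limit, in bytes */
} HashTableAccounting;

/* returns false once the limit is exceeded; the caller then spills or
 * increases the number of batches */
static bool
account_for_tuple(HashTableAccounting *acct, size_t tuple_size)
{
    acct->spaceUsed += tuple_size;
    return acct->spaceUsed <= acct->spaceAllowed;
}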

We do have MemoryContextStats, which essentially walks the memory
context hierarchy and collects all the important information. While
working on the hashjoin patch, I was thinking that maybe we could create
a MemoryContextStats clone, computing the accounting info on request.
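
Roughly, what I mean by such a clone is a recursive walk over the
context tree, summing the space only when asked - something like this
sketch (simplified stand-in types, not the real memnodes.h / mcxt.c
definitions):

#include <stddef.h>

/* Simplified stand-in for a memory context node. */
typedef struct MemContext
{
    size_t              total_space;   /* space in this context's blocks;
                                        * in the real thing computing this
                                        * itself means walking the blocks
                                        * and freelists */
    struct MemContext  *firstchild;
    struct MemContext  *nextchild;
} MemContext;

static size_t
mem_context_total(const MemContext *context)
{
    size_t              total = context->total_space;
    const MemContext   *child;

    /* children form a linked list chained through nextchild */
    for (child = context->firstchild; child != NULL; child = child->nextchild)
        total += mem_context_total(child);

    return total;
}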

So I did an experiment: I wrote this MemoryContextStats clone and
rewrote nodeHash.c to use it instead of the custom tracking. The
overhead was consistently ~2-3% for the hashjoin queries I measured
(e.g. 9100 ms instead of 8900 ms etc.).

I just tried the same with the hashjoin patch applied. With the built-in
accounting the queries now run ~2.5x faster - around 3500 ms. With the
MemoryContextStats clone, it takes a whopping 18000 ms.

I don't know what causes such an increase (I assume I'm using the
accounting info within a loop or something). The point is that with
Jeff's patch, it goes down to 3500 ms, with pretty much zero overhead.

I'll check why the patched hashjoin is so much slower, but even if I can
optimize it somehow, I don't think I'll get it to zero overhead. So I'd
say this accounting actually saves cycles.

Of course, that does not prove it's a well thought-out design, although
it seems sensible to me. It's the least complex accounting
implementation I can think of.

For example, I don't think being able to create nonsensical hierarchies
is a major flaw. We certainly should not do that in our own code, and we
should take reasonable precautions to prevent it in user code (for
example by inheriting the tracking flag in MemoryContextCreate, as Jeff
does). Can this be seen as a foot gun? Maybe, but we hand ammunition to
users on a daily basis - for example they can switch to
TopMemoryContext, but that does not make memory contexts bad.

But maybe the inheritance really is not necessary - maybe it would be
enough to track this per-context, and then just walk through the
contexts and collect it, because my observation is that most of the time
is actually spent walking through the blocks and freelists.

Then the issues with nonsensical hierarchies would disappear, and the
possible overhead of updating all the parent contexts would go away as
well (not sure it's even measurable, though).
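
To illustrate that per-context idea: each context only bumps its own
counter on alloc/free, and the hierarchy walk only happens when somebody
asks for the total - something like this sketch (again hypothetical
types, not the actual mcxt.c code):

#include <stddef.h>

typedef struct TrackedContext
{
    size_t                  self_allocated; /* this context's own blocks only */
    struct TrackedContext  *firstchild;
    struct TrackedContext  *nextchild;
} TrackedContext;

/* called from the alloc/free paths; no walk up the parent chain */
static void
track_alloc(TrackedContext *context, size_t size)
{
    context->self_allocated += size;
}

static void
track_free(TrackedContext *context, size_t size)
{
    context->self_allocated -= size;
}

/* the walk over children happens only when the total is requested */
static size_t
collect_allocated(const TrackedContext *context)
{
    size_t                  total = context->self_allocated;
    const TrackedContext   *child;

    for (child = context->firstchild; child != NULL; child = child->nextchild)
        total += collect_allocated(child);

    return total;
}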

regards
Tomas
