Re: 9.5: Better memory accounting, towards memory-bounded HashAgg

From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: 9.5: Better memory accounting, towards memory-bounded HashAgg
Date: 2014-08-22 18:13:01
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 20.8.2014 08:11, Jeff Davis wrote:
> On Tue, 2014-08-19 at 12:54 +0200, Tomas Vondra wrote:
>> The use-case for this is tracking a chosen subtree of contexts - e.g.
>> aggcontext and below, so I'd expect the tracked subtrees to be relatively
>> shallow. Am I right?
> Right.
>> My fear is that by removing the inheritance bit, we'll hurt cases with a
>> lot of child contexts. For example, array_agg currently creates a separate
>> context for each group - what happens if you have 100k groups and do
>> MemoryContextGetAllocated? I guess iterating over 100k groups is not free.
> Good point. We don't want to make checking the allocated space into an
> expensive operation, because then we will be forced to call it less
> frequently, which implies that we'd be sloppier about actually fitting
> in work_mem.
>> Wouldn't the solution with inheritance and propagating the accounting info
>> to the parent actually better? Or maybe allowing both, having two flags
>> when creating a context instead of one?
> That seems overly complicated. We only have one driving use case, so two
> orthogonal options sounds excessive. Perhaps one option if we can't
> solve the performance problem and we need to isolate the changes to
> hashagg.
>> Also, do we really need to track allocated bytes - couldn't we track
>> kilobytes or something and use smaller data types to get below the 64B?
> Good idea.
> I attached a patch that uses two uint32 fields so that it doesn't
> increase the size of MemoryContextData, and it tracks memory usage for
> all contexts. I was unable to detect any performance regression vs.
> master, but on my machine the results are noisy.

I don't think we really need to abandon the 'tracked' flag (or that we
should). I think it was useful, and removing it might be one of the
reasons why Robert now sees worse impact than before.

I assume you did that to get below 64B, right? What about to changing
'isReset' in MemoryContextData to a 'flags' field instead? It's a bool
now, i.e. 1B -> 8 bits.

I don't think it makes sense to abandon this only to get <= 64B, if it
forces you to touch multiple 64B structures unnecessarily.

Also, you're doing this for every block in AllocSetReset:

update_allocation(context, -(block->endptr - ((char*) block)));

So if there are 1000 blocks, you'll call that 1000x (and it will
propagate up to TopMemoryContext). Instead keep a local sum and only do
the call once at the end. Again, this might be one of the reasons why
Robet sees ~4% overhead.

BTW what happens when AllocSetContext is created with initBlockSize that
is not a multiple of 1kB? I see it's enforced to be at least 1024, but I
don't see anything forcing it to be a multiple of 1024?

What if I create a context with initBlockSize=1500? ISTM it'll break the
accounting, right? (Not that it would make sense to create such
contexts, but it's possible.)


In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2014-08-22 18:32:38 Re: [GSoC2014] Patch ALTER TABLE ... SET LOGGED
Previous Message Tom Lane 2014-08-22 18:02:42 Re: WIP Patch for GROUPING SETS phase 1