Re: Improving the memory allocator

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving the memory allocator
Date: 2011-04-26 00:12:11
Message-ID: 201104260212.11833.andres@anarazel.de
Lists: pgsql-hackers

On Tuesday, April 26, 2011 01:39:37 AM Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > So after all this my question basically is: How important do we think the
> > mctx.c abstraction is?
>
> I think it's pretty important. As a specific example, a new context
> type in which pfree() is a no-op would be fine with me. A new context
> type in which pfree() dumps core will be useless, or close enough to
> useless.
Well, what I suggested for that would be using a different API for such small,
static-size allocations.
But as I said, I would prefer to exhaust other possibilities first.

> That means you can't get rid of the per-chunk back-link to the
> context, but you might be able to get rid of the other overhead such as
> per-chunk size data. (It might be practical to not support
> GetMemoryChunkSize for such contexts, or if it's a slab allocator then
> you could possibly know that all the chunks in a given block have size X.)
For the slab allocator design I have in mind I would need to have a back
pointer to the block, not the context... That's one other reason why I started
thinking about removing the abstraction.
So far I couldn't envision a clean design where you can intermix two
implementations with such different interpretations. One could have the
blocks and contexts carry an 'allocator' node tag in the first element and
switch over that, but I don't really like that.
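
To make the block back pointer idea concrete, here is a minimal sketch. It is
not code from any patch; every name in it (SketchContext, SketchSlabBlock,
sketch_block_alloc, ...) is hypothetical, and it assumes a fixed chunk size per
block and plain malloc underneath. Each chunk header stores only a pointer to
its block, and the block knows both the owning context and the chunk size, so
no per-chunk size field is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for illustration only; not PostgreSQL's mctx.c API. */
typedef struct SketchContext
{
    const char *name;
} SketchContext;

struct SketchSlabBlock;

/* Chunk header: a back pointer to the block, padded to maximum alignment. */
typedef union SketchChunkHeader
{
    struct SketchSlabBlock *block;
    max_align_t align;
} SketchChunkHeader;

typedef struct SketchSlabBlock
{
    SketchContext *context;     /* owning context, reachable via the block */
    size_t      chunk_size;     /* payload size shared by all chunks here */
    size_t      stride;         /* header plus padded payload */
    size_t      nchunks;
    size_t      nused;
    char       *data;
} SketchSlabBlock;

static SketchSlabBlock *
sketch_block_create(SketchContext *ctx, size_t chunk_size, size_t nchunks)
{
    size_t      hdr = sizeof(SketchChunkHeader);
    SketchSlabBlock *block = malloc(sizeof(*block));

    if (block == NULL)
        return NULL;
    block->context = ctx;
    block->chunk_size = chunk_size;
    /* round the payload up so every header stays max-aligned */
    block->stride = hdr + (chunk_size + hdr - 1) / hdr * hdr;
    block->nchunks = nchunks;
    block->nused = 0;
    block->data = malloc(block->stride * nchunks);
    if (block->data == NULL)
    {
        free(block);
        return NULL;
    }
    return block;
}

/* Hand out the next unused chunk (no free list in this sketch). */
static void *
sketch_block_alloc(SketchSlabBlock *block)
{
    SketchChunkHeader *h;

    if (block->nused >= block->nchunks)
        return NULL;
    h = (SketchChunkHeader *) (block->data + block->stride * block->nused++);
    h->block = block;
    return h + 1;
}

/* Size and context are both found through the block back pointer. */
static size_t
sketch_chunk_size(void *ptr)
{
    return ((SketchChunkHeader *) ptr - 1)->block->chunk_size;
}

static SketchContext *
sketch_chunk_context(void *ptr)
{
    return ((SketchChunkHeader *) ptr - 1)->block->context;
}

static void
sketch_block_destroy(SketchSlabBlock *block)
{
    free(block->data);
    free(block);
}
```

The context is still reachable from any chunk, but only through one extra
indirection, which is where this layout differs from a per-chunk context link.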

And I don't see a way with that abstraction to let the compiler do expensive
stuff like the index offset determination at compile time instead of run time.
And I think pulling such computations out of runtime is
quite an important part of improvements in that area.
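
A rough sketch of what moving such a computation to compile time could look
like, assuming "index offset determination" means mapping a request size to a
power-of-two size-class index. The class limits (8 to 8192 bytes), the macro
and the function names are all made up for the example. When the size is a
constant such as a sizeof expression, the macro is an integer constant
expression and costs nothing at run time; the loop version is what a generic
allocation call has to do otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: 11 power-of-two classes, 8 bytes up to 8192 bytes. */
#define SKETCH_MIN_BITS     3
#define SKETCH_NUM_CLASSES  11

/* Integer constant expression whenever sz is a compile-time constant. */
#define SKETCH_CLASS_INDEX(sz) \
    ((sz) <= 8    ? 0 : \
     (sz) <= 16   ? 1 : \
     (sz) <= 32   ? 2 : \
     (sz) <= 64   ? 3 : \
     (sz) <= 128  ? 4 : \
     (sz) <= 256  ? 5 : \
     (sz) <= 512  ? 6 : \
     (sz) <= 1024 ? 7 : \
     (sz) <= 2048 ? 8 : \
     (sz) <= 4096 ? 9 : \
     (sz) <= 8192 ? 10 : -1)

/* Run-time equivalent: what has to happen when the size is not known. */
static int
sketch_class_index_runtime(size_t size)
{
    int         idx = 0;
    size_t      limit = (size_t) 1 << SKETCH_MIN_BITS;

    while (idx < SKETCH_NUM_CLASSES && limit < size)
    {
        limit <<= 1;
        idx++;
    }
    return idx < SKETCH_NUM_CLASSES ? idx : -1;
}

/* A hypothetical fixed-size node type, standing in for something like a list cell. */
typedef struct SketchListCell
{
    void       *ptr;
    struct SketchListCell *next;
} SketchListCell;

/* Both of these are resolved entirely by the compiler. */
_Static_assert(SKETCH_CLASS_INDEX(sizeof(SketchListCell)) >= 0,
               "SketchListCell must fit a size class");
enum
{
    SKETCH_LIST_CELL_CLASS = SKETCH_CLASS_INDEX(sizeof(SketchListCell))
};
```

A call site that knows its type could then pass SKETCH_LIST_CELL_CLASS straight
to the allocator instead of a byte count.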

> Another point worth making is that it's a nonstarter to propose running
> any large part of the system in memory context types that are incapable
> of supporting all the debugging options we rely on (freed-memory-reuse
> detection and write-past-end-of-chunk in particular). It's okay if a
> production build hasn't got that support, not okay for debug builds.
Totally with you. I have absolutely no problem with enlarging the chunk header
for debug builds and I can't envision a design where that would be a major
problem.
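
As an illustration of enlarging the chunk header only in debug builds, here is
a small sketch of write-past-end detection with a sentinel byte. The
SKETCH_DEBUG switch, the sentinel value and all function names are assumptions
made up for this example, and payload alignment is ignored for brevity. With
SKETCH_DEBUG undefined the header shrinks back to the single owner pointer and
the check compiles to a constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_DEBUG 1          /* pretend this is a debug build */
#define SKETCH_SENTINEL 0x7E    /* arbitrary marker byte */

typedef struct SketchChunkHeader
{
    void       *owner;          /* back link, present in every build */
#ifdef SKETCH_DEBUG
    size_t      requested_size; /* debug only: where the sentinel lives */
#endif
} SketchChunkHeader;

static void *
sketch_alloc(void *owner, size_t size)
{
    size_t      total = sizeof(SketchChunkHeader) + size;
    SketchChunkHeader *hdr;

#ifdef SKETCH_DEBUG
    total += 1;                 /* room for the sentinel byte */
#endif
    hdr = malloc(total);
    if (hdr == NULL)
        return NULL;
    hdr->owner = owner;
#ifdef SKETCH_DEBUG
    hdr->requested_size = size;
    ((unsigned char *) (hdr + 1))[size] = SKETCH_SENTINEL;
#endif
    return hdr + 1;
}

/* Returns 1 if nothing wrote past the end of the chunk (always 1 without debug). */
static int
sketch_chunk_is_intact(const void *ptr)
{
#ifdef SKETCH_DEBUG
    const SketchChunkHeader *hdr = (const SketchChunkHeader *) ptr - 1;

    return ((const unsigned char *) ptr)[hdr->requested_size] == SKETCH_SENTINEL;
#else
    (void) ptr;
    return 1;
#endif
}

static void
sketch_free(void *ptr)
{
    if (ptr != NULL)
        free((SketchChunkHeader *) ptr - 1);
}
```

The same allocator code runs in both builds; only the header layout and the
checks differ.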

> Perhaps you'll propose using completely different context
> implementations in the two cases, but I'd be suspicious of that because
> it'd mean the "fast" context code doesn't get any developer testing.
I don't like that option either. It's *way* too easy to screw up slightly in
that area.

> > Especially as I hope its possible to write a single allocator
> > which is "good enough" for everything.
> I'll lay a side bet that that approach is a dead end. If one size fits
> all were good enough, we'd not be having this conversation at all. The
> point of the mctx interface layer from the beginning was to support
> multiple allocator policies, and that's the direction I think we want to
> go to improve this.
I don't think I am with you here. I have not been around that long so I might be
missing something, but I haven't found much evidence of somebody trying to
improve the allocator as a whole instead of improving currently problematic
pieces for 10+ years. So I don't see enough evidence proving that
it is impossible to envision an allocator that's good enough for all
needs.

I very much fear having to figure out which allocator to use where. I don't
see that working very well.

> BTW, what your numbers actually suggest to me is not that we need a
> better allocator, but that we need a better implementation of List.
> We speculated back when Neil redid List the first time about aggregating
> list cells to reduce palloc traffic, but it was left out to keep the
> patch complexity down. Now that the bugs have been shaken out it might
> be time to have another go at that. In particular, teaching List to
> allocate the list head and first cell together would alone remove a
> third of your runtime ...
That's certainly true for that workload. The hotspot is somewhere else entirely
though if you start doing even mildly more complex statements than the default
read-only statements from pgbench.
It's actually not totally easy finding any workload that's not totally IO bound
where memory allocation is not in the top 5 in a profile...

But sure. Improving that point is a good idea independent of the allocator.
One less allocation won't hurt any allocator ;)
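
For reference, the "list head and first cell together" idea from the quoted
text could look roughly like the sketch below. This is not the actual List
code; the struct layout and all names are hypothetical, and it assumes the
embedded cell always stays the first element (no deletion of the head). An
allocation counter is included only to show the effect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Counts allocator calls so the saving is visible. */
static int  sketch_alloc_count = 0;

static void *
sketch_counted_malloc(size_t size)
{
    sketch_alloc_count++;
    return malloc(size);
}

typedef struct SketchCell
{
    void       *value;
    struct SketchCell *next;
} SketchCell;

typedef struct SketchList
{
    int         length;
    SketchCell *head;
    SketchCell *tail;
    SketchCell  first;          /* first cell lives inside the header */
} SketchList;

/* One allocation yields both the header and the first cell. */
static SketchList *
sketch_list_make1(void *value)
{
    SketchList *list = sketch_counted_malloc(sizeof(SketchList));

    if (list == NULL)
        return NULL;
    list->first.value = value;
    list->first.next = NULL;
    list->head = list->tail = &list->first;
    list->length = 1;
    return list;
}

/* Later cells are allocated one by one, as before. Returns 1 on success. */
static int
sketch_list_append(SketchList *list, void *value)
{
    SketchCell *cell = sketch_counted_malloc(sizeof(SketchCell));

    if (cell == NULL)
        return 0;
    cell->value = value;
    cell->next = NULL;
    list->tail->next = cell;
    list->tail = cell;
    list->length++;
    return 1;
}

static void
sketch_list_free(SketchList *list)
{
    SketchCell *cell = list->first.next;    /* skip the embedded cell */

    while (cell != NULL)
    {
        SketchCell *next = cell->next;

        free(cell);
        cell = next;
    }
    free(list);
}
```

A one-element list costs one allocation here instead of two, and a two-element
list costs two instead of three.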

Greetings,

Andres
