Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)
Date: 2013-01-09 22:14:52
Message-ID: 20130109221452.GA28653@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-01-09 15:43:19 -0500, Tom Lane wrote:
> I wrote:
> > I then applied the palloc.h and mcxt.c hunks of your patch and rebuilt.
> > Now I get an average runtime of 16666 ms, a full 2% faster, which is a
> > bit astonishing, particularly because the oprofile results haven't moved
> > much:
>
> I studied the assembly code being generated for palloc(), and I believe
> I see the reason why it's a bit faster: when there's only a single local
> variable that has to survive over the elog call, gcc generates a shorter
> function entry/exit sequence.

Makes sense.

> I had thought of proposing that we code
> palloc() like this:
>
> void *
> palloc(Size size)
> {
>     MemoryContext context = CurrentMemoryContext;
>
>     AssertArg(MemoryContextIsValid(context));
>
>     if (!AllocSizeIsValid(size))
>         elog(ERROR, "invalid memory alloc request size %lu",
>              (unsigned long) size);
>
>     context->isReset = false;
>
>     return (*context->methods->alloc) (context, size);
> }
>
> but at least on this specific hardware and compiler that would evidently
> be a net loss compared to direct use of CurrentMemoryContext. I would
> not put a lot of faith in that result holding up on other machines
> though.

Isn't that optimized to the same thing? ISTM the compiler should produce
exactly the same code for both.
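
For reference, the two forms being compared can be sketched side by side.
This is only an illustration with hypothetical stand-ins (a stripped-down
MemoryContextData, sketch_elog_error() in place of elog(ERROR, ...),
assert() in place of AssertArg(), and a malloc-backed alloc method), not
the actual mcxt.c code. One point worth keeping in mind: in C, a call to a
function the compiler cannot see into may modify a global, so the two
forms are not strictly equivalent at the language level. The first may
re-read CurrentMemoryContext after the error call, while the second has to
keep its local copy alive across it. Whether that changes the generated
code depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, stripped-down stand-ins for the PostgreSQL types. */
typedef struct MemoryContextData *MemoryContext;

struct MemoryContextData
{
    int   isReset;
    void *(*alloc) (MemoryContext context, size_t size);
};

MemoryContext CurrentMemoryContext = NULL;

/* Assumed size limit, for illustration only. */
#define SKETCH_MAX_ALLOC ((size_t) 0x3fffffff)

/*
 * Stand-in for elog(ERROR, ...). In the real tree this lives in another
 * translation unit, so the compiler cannot assume that it never returns
 * or that it leaves globals untouched.
 */
void
sketch_elog_error(unsigned long size)
{
    fprintf(stderr, "invalid memory alloc request size %lu\n", size);
    exit(1);
}

/* Trivial alloc method backed by malloc(), just to make the sketch run. */
void *
sketch_malloc_alloc(MemoryContext context, size_t size)
{
    (void) context;
    return malloc(size);
}

/* Form 1: use the global directly at every reference. */
void *
palloc_global(size_t size)
{
    assert(CurrentMemoryContext != NULL);

    if (size > SKETCH_MAX_ALLOC)
        sketch_elog_error((unsigned long) size);

    CurrentMemoryContext->isReset = 0;

    return CurrentMemoryContext->alloc(CurrentMemoryContext, size);
}

/* Form 2: copy the global into a local that must survive the error call. */
void *
palloc_local(size_t size)
{
    MemoryContext context = CurrentMemoryContext;

    assert(context != NULL);

    if (size > SKETCH_MAX_ALLOC)
        sketch_elog_error((unsigned long) size);

    context->isReset = 0;

    return context->alloc(context, size);
}
```

Compiling both with something like gcc -O2 -S and comparing the function
entry/exit sequences would show whether a given compiler treats them
identically.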

> In any case this doesn't explain the whole 2% speedup, but it probably
> accounts for palloc() showing as slightly cheaper than
> MemoryContextAlloc had been in the oprofile listing.

I'd guess that a good part of the cost is just smeared across all
callers and not attributable to any individual function visible in the
profile. Additionally, with functions as short as MemoryContextAllocZero,
profilers like oprofile (and perf) often leak quite a bit of the actual
cost to the call sites, in my experience.

I wonder whether it makes sense to "inline" the contents of pstrdup()
as well? My gut feeling is no, but...
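
To make concrete what "inlining" would mean here, a rough sketch follows.
All names are hypothetical: sketch_palloc() stands in for palloc() and is
backed by malloc(), and sketch_context_strdup() stands in for a
context-level strdup helper. The assumption is that the layered form
forwards to such a helper, while the inlined form does the
strlen/allocate/memcpy work itself and so saves one call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for palloc(): allocate or abort, never return NULL. */
static void *
sketch_palloc(size_t size)
{
    void *ptr = malloc(size);

    if (ptr == NULL)
        abort();
    return ptr;
}

/* Hypothetical stand-in for a context-level strdup helper. */
char *
sketch_context_strdup(const char *string)
{
    size_t len = strlen(string) + 1;    /* include the terminating NUL */
    char  *copy = sketch_palloc(len);

    memcpy(copy, string, len);
    return copy;
}

/* Layered form: pstrdup() only forwards to the helper (one extra call). */
char *
pstrdup_layered(const char *string)
{
    assert(string != NULL);
    return sketch_context_strdup(string);
}

/* "Inlined" form: the helper's body is repeated directly in pstrdup(). */
char *
pstrdup_inlined(const char *string)
{
    size_t len;
    char  *copy;

    assert(string != NULL);

    len = strlen(string) + 1;           /* include the terminating NUL */
    copy = sketch_palloc(len);
    memcpy(copy, string, len);
    return copy;
}
```

Both forms return an independent copy of the input. The only difference
is the extra call level, which is presumably small next to the strlen()
and memcpy() work for anything but very short strings.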

I would like to move CurrentMemoryContext to memutils.h, but that seems
to require too many changes.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
