Skip site navigation (1) Skip section navigation (2)

Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>,pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)
Date: 2013-01-09 22:14:52
Message-ID: 20130109221452.GA28653@awork2.anarazel.de (view raw or flat)
Thread:
Lists: pgsql-hackers
On 2013-01-09 15:43:19 -0500, Tom Lane wrote:
> I wrote:
> > I then applied the palloc.h and mcxt.c hunks of your patch and rebuilt.
> > Now I get an average runtime of 16666 ms, a full 2% faster, which is a
> > bit astonishing, particularly because the oprofile results haven't moved
> > much:
>
> I studied the assembly code being generated for palloc(), and I believe
> I see the reason why it's a bit faster: when there's only a single local
> variable that has to survive over the elog call, gcc generates a shorter
> function entry/exit sequence.

Makes sense.

>  I had thought of proposing that we code
> palloc() like this:
>
> void *
> palloc(Size size)
> {
>     MemoryContext context = CurrentMemoryContext;
>
>     AssertArg(MemoryContextIsValid(context));
>
>     if (!AllocSizeIsValid(size))
>         elog(ERROR, "invalid memory alloc request size %lu",
>              (unsigned long) size);
>
>     context->isReset = false;
>
>     return (*context->methods->alloc) (context, size);
> }
>
> but at least on this specific hardware and compiler that would evidently
> be a net loss compared to direct use of CurrentMemoryContext.  I would
> not put a lot of faith in that result holding up on other machines
> though.

Thats not optimized to the same? ISTM the compiler should produce
exactly the same code for both.

> In any case this doesn't explain the whole 2% speedup, but it probably
> accounts for palloc() showing as slightly cheaper than
> MemoryContextAlloc had been in the oprofile listing.

I'd guess that a good part of the cost is just smeared across all
callers and not individually accountable to any function visible in the
profile. Additionally, With functions as short as MemoryContextAllocZero
profiles like oprofile (and perf) also often leak quite a bit of the
actual cost to the callsites in my experience.

I wonder whether it makes sense to "inline" the contents pstrdup()
additionally? My gut feeling is not, but...

I would like to move CurrentMemoryContext to memutils.h, but that seems
to require too many changes.

Greetings,

Andres Freund

-- 
 Andres Freund	                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


In response to

Responses

pgsql-hackers by date

Next:From: Andres FreundDate: 2013-01-09 22:22:29
Subject: Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)
Previous:From: Tom LaneDate: 2013-01-09 22:09:54
Subject: Re: Index build temp files

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group