Re: Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)
Date: 2013-01-09 20:07:10
Message-ID: 705.1357762030@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> Well, I *did* benchmark it as noted elsewhere in the thread, but that's
> obviously just one machine (E5520 x 2) with one rather restricted workload
> (pgbench -S -jc 40 -T60). At least it's rather palloc heavy.

> Here are the numbers:

> before:
> #101646.763208 101350.361595 101421.425668 101571.211688 101862.172051 101449.857665
> after:
> #101553.596257 102132.277795 101528.816229 101733.541792 101438.531618 101673.400992

> So on my system if there is a difference, it's positive (0.12%).
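
The quoted 0.12% can be reproduced from the six runs on each side. The following is a minimal sketch, assuming the figures are transactions per second and that the comparison is between plain arithmetic means; the function name relative_change_pct is illustrative and not taken from the thread. It also prints the run-to-run spread, which gives a sense of how large the difference is relative to the noise.

```python
from statistics import mean

# Figures copied from the quoted pgbench runs (assumed to be tps).
before = [101646.763208, 101350.361595, 101421.425668,
          101571.211688, 101862.172051, 101449.857665]
after = [101553.596257, 102132.277795, 101528.816229,
         101733.541792, 101438.531618, 101673.400992]


def relative_change_pct(old, new):
    """Change of mean(new) relative to mean(old), in percent."""
    return (mean(new) - mean(old)) / mean(old) * 100.0


def spread_pct(runs):
    """Distance between the slowest and fastest run, in percent of the mean."""
    return (max(runs) - min(runs)) / mean(runs) * 100.0


change = relative_change_pct(before, after)
print(f"mean before: {mean(before):.1f}")
print(f"mean after:  {mean(after):.1f}")
print(f"change:      {change:+.2f}%")
print(f"spread before/after: {spread_pct(before):.2f}% / {spread_pct(after):.2f}%")
```

With these numbers the spread within each set of runs is larger than the difference between the two means, so the result is best read as "no measurable regression" rather than a demonstrated speedup.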

pgbench-based testing doesn't fill me with a lot of confidence for this
--- its numbers contain a lot of communication overhead, not to mention
that pgbench itself can be a bottleneck. It struck me that we have a
recent test case that's known to be really palloc-intensive, namely
Pavel's example here:
http://www.postgresql.org/message-id/CAFj8pRCKfoz6L82PovLXNK-1JL=jzjwaT8e2BD2PwNKm7i7KVg@mail.gmail.com

I set up a non-cassert build of commit
78a5e738e97b4dda89e1bfea60675bcf15f25994 (ie, just before the patch that
reduced the data-copying overhead for that). On my Fedora 16 machine
(dual 2.0GHz Xeon E5503, gcc version 4.6.3 20120306 (Red Hat 4.6.3-2))
I get a runtime for Pavel's example of 17023 msec (average over five
runs). I then applied oprofile and got a breakdown like this:

samples|      %|
------------------
 108409 84.5083 /home/tgl/testversion/bin/postgres
  13723 10.6975 /lib64/libc-2.14.90.so
   3153  2.4579 /home/tgl/testversion/lib/postgresql/plpgsql.so

samples  %        symbol name
10960    10.1495  AllocSetAlloc
 6325     5.8572  MemoryContextAllocZeroAligned
 6225     5.7646  base_yyparse
 3765     3.4866  copyObject
 2511     2.3253  MemoryContextAlloc
 2292     2.1225  grouping_planner
 2044     1.8928  SearchCatCache
 1956     1.8113  core_yylex
 1763     1.6326  expression_tree_walker
 1347     1.2474  MemoryContextCreate
 1340     1.2409  check_stack_depth
 1276     1.1816  GetCachedPlan
 1175     1.0881  AllocSetFree
 1106     1.0242  GetSnapshotData
 1106     1.0242  _SPI_execute_plan
 1101     1.0196  extract_query_dependencies_walker

I then applied the palloc.h and mcxt.c hunks of your patch and rebuilt.
Now I get an average runtime of 16666 msec, a full 2% faster, which is a
bit astonishing, particularly because the oprofile results haven't moved
much:

 107642 83.7427 /home/tgl/testversion/bin/postgres
  14677 11.4183 /lib64/libc-2.14.90.so
   3180  2.4740 /home/tgl/testversion/lib/postgresql/plpgsql.so

samples  %        symbol name
10038     9.3537  AllocSetAlloc
 6392     5.9562  MemoryContextAllocZeroAligned
 5763     5.3701  base_yyparse
 4810     4.4821  copyObject
 2268     2.1134  grouping_planner
 2178     2.0295  core_yylex
 1963     1.8292  palloc
 1867     1.7397  SearchCatCache
 1835     1.7099  expression_tree_walker
 1551     1.4453  check_stack_depth
 1374     1.2803  _SPI_execute_plan
 1282     1.1946  MemoryContextCreate
 1187     1.1061  AllocSetFree
...
  653     0.6085  palloc0
...
  552     0.5144  MemoryContextAlloc

The number of calls of AllocSetAlloc certainly hasn't changed at all, so
how did that get faster?
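
To make the "haven't moved much" comparison concrete, here is a minimal sketch that diffs the sample counts of the two profiles and recomputes the runtime change. It assumes that raw sample counts from the two runs are directly comparable, which is only roughly true since the total number of samples differs slightly. Only symbols listed in both profiles are diffed; rows hidden behind "..." are not guessed at.

```python
# Average runtimes reported above, in milliseconds.
runtime_before_ms = 17023
runtime_after_ms = 16666

# Sample counts copied from the two oprofile listings (symbols present in both).
before = {
    "AllocSetAlloc": 10960, "MemoryContextAllocZeroAligned": 6325,
    "base_yyparse": 6225, "copyObject": 3765, "MemoryContextAlloc": 2511,
    "grouping_planner": 2292, "SearchCatCache": 2044, "core_yylex": 1956,
    "expression_tree_walker": 1763, "MemoryContextCreate": 1347,
    "check_stack_depth": 1340, "AllocSetFree": 1175, "_SPI_execute_plan": 1106,
}
after = {
    "AllocSetAlloc": 10038, "MemoryContextAllocZeroAligned": 6392,
    "base_yyparse": 5763, "copyObject": 4810, "MemoryContextAlloc": 552,
    "grouping_planner": 2268, "SearchCatCache": 1867, "core_yylex": 2178,
    "expression_tree_walker": 1835, "MemoryContextCreate": 1282,
    "check_stack_depth": 1551, "AllocSetFree": 1187, "_SPI_execute_plan": 1374,
}
# Symbols that show up only in the patched profile.
new_symbols = {"palloc": 1963, "palloc0": 653}

speedup_pct = (runtime_before_ms - runtime_after_ms) / runtime_before_ms * 100.0
print(f"runtime change: {speedup_pct:.1f}% faster")

# Per-symbol change in samples, largest decrease first.
deltas = {sym: after[sym] - before[sym] for sym in before}
for sym, delta in sorted(deltas.items(), key=lambda kv: kv[1]):
    print(f"{sym:32s} {delta:+6d}")

for sym, count in new_symbols.items():
    print(f"{sym:32s} {count:6d} (new)")
```

In these listings the drop in MemoryContextAlloc samples is close to the number of samples now attributed to palloc, which is consistent with the same calls simply arriving through a different entry point. The decrease in AllocSetAlloc itself is not explained by that shift.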

I notice that the postgres executable is about 0.2% smaller, presumably
because a whole lot of inlined fetches of CurrentMemoryContext are gone.
This makes me wonder if my result is due to chance improvements of cache
line alignment for inner loops.

I would like to know if other people get comparable results on other
hardware (non-Intel hardware would be especially interesting). If this
result holds up across a range of platforms, I'll withdraw my objection
to making palloc a plain function.

regards, tom lane
