RFC: list API / memory allocations

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: RFC: list API / memory allocations
Date: 2011-11-18 20:47:18
Message-ID: 201111182147.19104.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi List,

In many scenarios memory allocation is one of the top 3 functions showing up
in profiles. Looking at hierarchical profiles (-fno-omit-frame-pointer) at least
during parsing, planning and executor startup most of that is spent around the
list API.

Many - especially in the parse-analyze phase - of those allocations can be
avoided because the lists are immutable and their size is known upfront.

Some examples:
parser: list_make$n, buildRelationAliases
planner: expression_tree_mutator, get_relation_info;
executor: ExecInitExpr

For that I added new functions/defines which allocate all the needed memory in
one hunk:
list_immutable_make$n(),
List *list_new_immutable_n(NodeTag t, size_t n);
List *list_make_n(NodeTag t, size_t n, ...);

With those I could avoid about 1/3 of the allocations in some example
scenarios (pgbench, custom benchmarks) by replacing a few notable callsites to
use statically allocated lists.

The obvious problem with that approach is that approach is that those List's
and ListCell's cannot be individually deallocated. Which especially in the
planner isn't done anyway.
To check that I added an allocate_only to both when USE_ASSERT_CHECKING.

Using that approach I did measure improvements between 0.5-20% depending on
the statement (using -M simple). Complex statements which are quick to execute
and not too complicated to plan are the ones benefiting most. Which is not
surprising.

The questions I have here:
* is that approach acceptable? I converted a the most notable callsites and
the benefit is quite measurable. On the other hand one has to be rather careful
when analyzing whether lists will be deallocated later on. Also its touching
rather much code

* I plan to add list_copy_immutable as another function as that is the next
biggest scenario where memory is allocated in forseeable scenarios.

* any suggestions what name to use instead of immutable? That doesn't really
cut the real meaning of "allocate only". I didn't find a name though.

Then there is the 2/3 of calls which I couldn't beat with that approach.
Mostly because the size of the lists is too hard to figure out.
My current approach to that would be to preallocate listcells by some caller
defined amount.
A *very* quick hack which always overallocates Lists to contain 10 elements
yields promising results (around another 2-20% of parse only time).

The problem with that is that is that the most sensible way seems to be to
define the amount of preallocation per list. Which is currently not easily
representable because empty lists are just 0/NILL.

So I have 3 solutions. All of which are not that great:
* allow empty Lists. This possibly requires a bit of code churn but seems
somewhat sensible.
* Save the amount of preallocation needed in NIL by defining that no pointer
can be less than say 0x100. That also requires some churn and gets points for
uglyness
* always overallocate new lists. This is rather wasteful.

Any ideas?

if anybody wants the preliminary patches I am happy to send them but I would
much rather prefer to make them somewhat after input.

Andres

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-11-18 21:11:29 Re: RFC: list API / memory allocations
Previous Message Josh Berkus 2011-11-18 20:36:59 Re: Core Extensions relocation