|From:||Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>|
|Subject:||POC: converting Lists into arrays|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
For quite some years now there's been dissatisfaction with our List
data structure implementation. Because it separately palloc's each
list cell, it chews up lots of memory, and it's none too cache-friendly
because the cells aren't necessarily adjacent. Moreover, our typical
usage is to just build a list by repeated lappend's and never modify it,
so that the flexibility of having separately insertable/removable list
cells is usually wasted.
Every time this has come up, I've opined that the right fix is to jack
up the List API and drive a new implementation underneath, as we did
once before (cf commit d0b4399d81). I thought maybe it was about time
to provide some evidence for that position, so attached is a POC patch
that changes Lists into expansible arrays, while preserving most of
their existing API.
The big-picture performance change is that this makes list_nth()
a cheap O(1) operation, while lappend() is still pretty cheap;
on the downside, lcons() becomes O(N), as does insertion or deletion
in the middle of a list. But we don't use lcons() very much
(and maybe a lot of the existing usages aren't really necessary?),
while insertion/deletion in long lists is a vanishingly infrequent
operation. Meanwhile, making list_nth() cheap is a *huge* win.
The most critical thing that we lose by doing this is that when a
List is modified, all of its cells may need to move, which breaks
a lot of code that assumes it can insert or delete a cell while
hanging onto a pointer to a nearby cell. In almost every case,
this takes the form of doing list insertions or deletions inside
a foreach() loop, and the pointer that's invalidated is the loop's
current-cell pointer. Fortunately, with list_nth() now being so cheap,
we can replace these uses of foreach() with loops using an integer
index variable and fetching the next list element directly with
list_nth(). Most of these places were loops around list_delete_cell
calls, which I replaced with a new function list_delete_nth_cell
to go along with the emphasis on the integer index.
I don't claim to have found every case where that could happen,
although I added debug support in list.c to force list contents
to move on every list modification, and this patch does pass
check-world with that support turned on. I fear that some such
bugs remain, though.
There is one big way in which I failed to preserve the old API
syntactically: lnext() now requires a pointer to the List as
well as the current ListCell, so that it can figure out where
the end of the cell array is. That requires touching something
like 150 places that otherwise wouldn't have had to be touched,
which is annoying, even though most of those changes are trivial.
I thought about avoiding that by requiring Lists to keep a "sentinel"
value in the cell after the end of the active array, so that lnext()
could look for the sentinel to detect list end. However, that idea
doesn't really work, because if the list array has been moved, the
spot where the sentinel had been could have been reallocated and
filled with something else. So this'd offer no defense against the
possibility of a stale ListCell pointer, which is something that
we absolutely need defenses for. As the patch stands we can have
quite a strong defense, because we can check whether the presented
ListCell pointer actually points into the list's current data array.
Another annoying consequence of lnext() needing a List pointer is that
the List arguments of foreach() and related macros are now evaluated
each time through the loop. I could only find half a dozen places
where that was actually unsafe (all fixed in the draft patch), but
it's still bothersome. I experimented with ways to avoid that, but
the only way I could get it to work was to define foreach like this:
#define foreach(cell, l) for (const List *cell##__foreach = foreach_setup(l, &cell); cell != NULL; cell = lnext(cell##__foreach, cell))
static inline const List *
foreach_setup(const List *l, ListCell **cell)
*cell = list_head(l);
That works, but there are two problems. The lesser one is that a
not-very-bright compiler might think that the "cell" variable has to
be forced into memory, because its address is taken. The bigger one is
that this coding forces the "cell" variable to be exactly "ListCell *";
you can't add const or volatile qualifiers to it without getting
compiler warnings. There are actually more places that'd be affected
by that than by the need to avoid multiple evaluations. I don't think
the const markings would be a big deal to lose, and the two places in
do_autovacuum that need "volatile" (because of a nearby PG_TRY) could
be rewritten to not use foreach. So I'm tempted to do that, but it's
not very pretty. Maybe somebody can think of a better solution?
There's a lot of potential follow-on work that I've not touched yet:
1. This still involves at least two palloc's for every nonempty List,
because I kept the header and the data array separate. Perhaps it's
worth allocating those in one palloc. However, right now there's an
assumption that the header of a nonempty List doesn't move when you
change its contents; that's embedded in the API of the lappend_cell
functions, and more than likely there's code that depends on that
silently because it doesn't bother to store the results of other
List functions back into the appropriate pointer. So if we do that
at all I think it'd be better tackled in a separate patch; and I'm
not really convinced it's worth the trouble and extra risk of bugs.
2. list_qsort() is now absolutely stupidly defined. It should just
qsort the list's data array in-place. But that requires an API
break for the caller-supplied comparator, since there'd be one less
level of indirection. I think we should change this, but again it'd
be better done as an independent patch to make it more visible in the
3. There are a number of places where we've built flat arrays
paralleling Lists, such as the planner's simple_rte_array. That's
pointless now and could be undone, buying some code simplicity.
Various other micro-optimizations could doubtless be done too;
I've not looked hard.
I haven't attempted any performance measurements on this, but at
least in principle it should speed things up a bit, especially
for complex test cases involving longer Lists. I don't have any
very suitable test cases at hand, anyway.
I think this is too late for v12, both because of the size of the
patch and because of the likelihood that it introduces a few bugs.
I'd like to consider pushing it early in the v13 cycle, though.
regards, tom lane
|Next Message||David Rowley||2019-02-24 07:21:04||Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath|
|Previous Message||Andres Freund||2019-02-24 00:18:54||Re: \describe*|