Skip site navigation (1) Skip section navigation (2)

Re: embedded list v3

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v3
Date: 2012-09-30 21:34:09
Message-ID: 201209302334.09495.andres@2ndquadrant.com (view raw or flat)
Thread:
Lists: pgsql-hackers
Hi,

On Sunday, September 30, 2012 10:33:28 PM Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > Current version is available at branch ilist in:
> > git://git.postgresql.org/git/users/andresfreund/postgres.git
> > ssh://git(at)git(dot)postgresql(dot)org/users/andresfreund/postgres.git
> 
> I'm still pretty desperately unhappy with your insistence on circularly
> linked dlists. Not only does that make initialization problematic,
> but now it's not even consistent with slists.
The slist code originally grew out of the dlist code and thus was pretty 
similar, but at some point (after your dislike of the circular lists?, not 
sure) I thought about it again and found no efficiency differences for any of 
the common manipulations in single linked lists because you don't need to deal 
with prev and tail pointers, so I saw no point in insisting there. No external 
user should care.

> A possible compromise that would fix the initialization issue is to
> allow an empty dlist to be represented as *either* two NULL pointers
> or a pair of self-pointers.  Conversion from one case to the other
> could happen like this:

> It appears to me that of the inline'able functions, only dlist_push_head
> and dlist_push_tail would need to do this; the others presume nonempty
> lists and so wouldn't have to deal with the NULL/NULL case.
> dlist_is_empty would need one extra test too.
The problem is that dlist_push_head/tail and and dlist_is_empty are prety hot 
code paths.

In transaction reassembly/wal decoding for every wal record that we need to 
look at in the context of a transaction the code very roughly does something 
like:

/* get change */
Change *change;

if (dlist_is_empty(&apply_cache->cached_changes))
     change = dlist_container(..., dlist_pop_head_node(&apply_cache-
>cached_changes))
else
     change = malloc(...);

/* get data from wal */
fill_change_change(current_wal_record, change);

/* remember change, till TX is complete */
dlist_push_tail(tx->changes, change->node);

(In reality more of those happen, but anyway)

We literally have tens of thousands list manipulation a second if the server is 
busy. Iteration only happens once a XLOG_COMMIT/ABORT is found (in combination 
with merging the subtransaction's changes).


In the slab allocator I originally built this for it was exactly the same. The 
lists are manipulated for every palloc/pfree but only iterated at 
MemoryContextReset.

I am really sorry for being stubborn here, but I changed to circular lists 
after profiling and finding that pipeline stalls & misprediced branches where 
the major thing I could change. Not sure how we can resolv this :(

> BTW, the "fast path" in dlist_move_head can't be right can it?
Yea, its crap, thanks for noticing. Shouldn't cause any issues except being 
slower, thats why I probably didn't notice I broke it at some point. ->next is 
missing.

Greetings,

Andres
-- 
 Andres Freund	                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


In response to

Responses

pgsql-hackers by date

Next:From: Andres FreundDate: 2012-09-30 21:42:56
Subject: Re: embedded list v3
Previous:From: Tom LaneDate: 2012-09-30 20:48:01
Subject: Re: embedded list v3

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group