Re: embedded list v2

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-16 14:23:14
Message-ID: 201209161623.14340.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Saturday, September 15, 2012 07:21:44 PM Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On Saturday, September 15, 2012 07:32:45 AM Tom Lane wrote:
> >> Well, actually, that just brings us to the main point which is: I do not
> >> believe that circular links are a good design choice here.
> >
> > I think I have talked about the reasoning on the list before, but here it
> > goes: The cases where I primarily used them are cases where the list
> > *manipulation* is a considerable part of the expense. In those situations
> > having less branches is beneficial and, to my knowledge, cannot be done
> > in normal flat lists.
> > For the initial user of those lists - the slab allocator for postgres
> > which I intend to finish at some point - the move to circular lists
> > instead of classical lists was an improvement in the 12-15% range.
>
> Let me make my position clear: I will not accept performance as the sole
> figure of merit for this datatype. It also has to be easy and reliable
> to use, and the current design is a failure on those dimensions.
> This question of ease and reliability of use isn't just academic, since
> you guys had trouble converting catcache.c without introducing bugs.
> That isn't exactly a positive showing for this design.
Uhm. I had introduced a bug there, true (Maybe Alvaro as well, I can't remember
anything right now). But it was something like getting the tail instead of the
head element due to copy paste. Nothing will be able to protect the code from
me.

> Furthermore, one datapoint for one use-case (probably measured on only
> one CPU architecture) isn't even a very convincing case for the
> performance being better. To take a handy example, if we were to
> convert dynahash to use dlists, having to initialize each hash bucket
> header this way would probably be a pretty substantial cost for a lot
> of hashtable use-cases. We have a lot of short-lived dynahash tables.

What do you think about doing something like:

#define DLIST_INIT(name) {{&name.head, &name.head}}
static dlist_head DatabaseList = DLIST_INIT(DatabaseList);

That works, but obviously the initialization will have to be performed at some
point.

> This also ties into the other problem, since it's hard to code such
> loop control as a macro without multiple evaluation of the list-head
> argument. To me that's a show stopper of the first order. I'm not
> going to accept a replacement for foreach() that introduces bug hazards
> that aren't in foreach().
What do you think about something like:

typedef struct dlist_iter
{
/*
* Use a union with equivalent storage as dlist_node to make it possible to
* initialize the struct inside a macro without multiple evaluation.
*/
union {
struct {
dlist_node *cur;
dlist_node *end;
};
dlist_node init;
};
} dlist_iter;

typedef struct dlist_mutable_iter
{
union {
struct {
dlist_node *cur;
dlist_node *end;
};
dlist_node init;
};
dlist_node *next;
} dlist_mutable_iter;

#define dlist_iter_foreach(iter, ptr) \
for (iter.init = (ptr)->head; iter.cur != iter.end; \
iter.cur = iter.cur->next)

#define dlist_iter_foreach_modify(iter, ptr) \
for (iter.init = (ptr)->head, iter.next = iter.cur->next; \
iter.cur != iter.end \
iter.cur = iter.next, iter.next = iter.cur->next)

With that and some trivial changes *all* multiple evaluation possibilities are
gone.

(_iter_ in there would go, thats just so I can have both in the same file for
now).

> There are certainly some multiple-evaluation macros, but I don't think
> they are associated with core data types. You will not find any in
> pg_list.h for instance. I think it's important that these new forms
> of list be as easy and reliable to use as List is. I'm willing to trade
> off some micro-performance to have that.
Just from what I had in open emacs frames without switching buffers when I read
that email:
Min/Max in c.h and about half of the macros in htup.h (I had the 9.1 tree open
at that point)...

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2012-09-16 14:36:13 Re: _FORTIFY_SOURCE by default?
Previous Message Gurjeet Singh 2012-09-16 13:49:35 Patch to include c.h