Re: embedded list v2

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-15 17:21:44
Message-ID: 17398.1347729704@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On Saturday, September 15, 2012 07:32:45 AM Tom Lane wrote:
>> Well, actually, that just brings us to the main point which is: I do not
>> believe that circular links are a good design choice here.

> I think I have talked about the reasoning on the list before, but here
> goes: the cases where I primarily used them are cases where the list
> *manipulation* is a considerable part of the expense. In those situations
> having fewer branches is beneficial and, to my knowledge, cannot be achieved
> with normal flat lists.
> For the initial user of those lists - the slab allocator for postgres which I
> intend to finish at some point - the move to circular lists instead of
> classical lists was an improvement in the 12-15% range.
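
For onlookers, the branch savings being described look roughly like this
(a sketch with illustrative names, not the submitted ilist.h code): because
the head of a circular list is itself a node, delete and insert never have
to branch on NULL neighbors.

    typedef struct dlist_node
    {
        struct dlist_node *prev;
        struct dlist_node *next;
    } dlist_node;

    static inline void
    dlist_delete(dlist_node *node)
    {
        node->prev->next = node->next;  /* no "is there a prev?" test */
        node->next->prev = node->prev;  /* no "is there a next?" test */
    }

    static inline void
    dlist_push_head(dlist_node *head, dlist_node *node)
    {
        node->next = head->next;
        node->prev = head;
        node->next->prev = node;
        head->next = node;
    }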

Let me make my position clear: I will not accept performance as the sole
figure of merit for this datatype. It also has to be easy and reliable
to use, and the current design is a failure on those dimensions.
This question of ease and reliability of use isn't just academic, since
you guys had trouble converting catcache.c without introducing bugs.
That isn't exactly a positive showing for this design.

Furthermore, one datapoint for one use-case (probably measured on only
one CPU architecture) isn't even a very convincing case for the
performance being better. To take a handy example, if we were to
convert dynahash to use dlists, having to initialize each hash bucket
header this way would probably be a pretty substantial cost for a lot
of hashtable use-cases. We have a lot of short-lived dynahash tables.
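
To make the initialization point concrete (illustrative names, not
dynahash's actual code): NULL-terminated buckets can be cleared with a
single memset, whereas self-referencing circular heads have to be set up
one at a time.

    #include <string.h>

    typedef struct dlist_node
    {
        struct dlist_node *prev;
        struct dlist_node *next;
    } dlist_node;

    /* open, NULL-terminated buckets: one memset, no per-bucket work */
    static void
    init_buckets_open(void **buckets, int nbuckets)
    {
        memset(buckets, 0, nbuckets * sizeof(void *));
    }

    /* circular buckets: every head must be made to point at itself */
    static void
    init_buckets_circular(dlist_node *buckets, int nbuckets)
    {
        int         i;

        for (i = 0; i < nbuckets; i++)
        {
            buckets[i].prev = &buckets[i];
            buckets[i].next = &buckets[i];
        }
    }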

> In what way do they make iteration more expensive? ptr != head shouldn't be
> more expensive than !ptr.

That's probably true *as long as the head pointer is available in a
register*. But having to reserve a second register for the loop
mechanics can be a serious loss on register-poor architectures.
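
To spell the comparison out (again a sketch, not code from the patch):

    typedef struct node
    {
        struct node *next;
    } node;

    /* NULL-terminated: the termination test touches only the cursor */
    void
    walk_open(node *first)
    {
        node       *p;

        for (p = first; p != NULL; p = p->next)
            ;                   /* work on p */
    }

    /* circular: "head" has to stay live across every iteration's test */
    void
    walk_circular(node *head)
    {
        node       *p;

        for (p = head->next; p != head; p = p->next)
            ;                   /* work on p */
    }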

This also ties into the other problem, since it's hard to code such
loop control as a macro without multiple evaluation of the list-head
argument. To me that's a show stopper of the first order. I'm not
going to accept a replacement for foreach() that introduces bug hazards
that aren't in foreach().
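
For anyone who hasn't been bitten by this before: the obvious way to write
such an iterator as a macro (hypothetical, not what the patch actually
does) repeats the head expression in both the initializer and the loop
test, so a call like naive_dlist_foreach(e, get_bucket(tab, h)) would
re-evaluate get_bucket() on every iteration.

    #define naive_dlist_foreach(iter, headexpr)  \
        for ((iter) = (headexpr)->next;          \
             (iter) != (headexpr);               \
             (iter) = (iter)->next)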

>> I don't really care. If you can't build it without multiple-evaluation
>> macros, it's too dangerous for a fundamental construct that's meant to
>> be used all over the place. Time to redesign.

> It's not like pg doesn't already use other popular macros that have
> multiple-evaluation hazards.

There are certainly some multiple-evaluation macros, but I don't think
they are associated with core data types. You will not find any in
pg_list.h for instance. I think it's important that these new forms
of list be as easy and reliable to use as List is. I'm willing to trade
off some micro-performance to have that.
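
For reference, the foreach() I'm holding up as the standard looks
approximately like this in pg_list.h, and evaluates its List argument
only once, in the loop initializer:

    #define foreach(cell, l) \
        for ((cell) = list_head(l); (cell) != NULL; (cell) = lnext(cell))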

regards, tom lane
