Re: Use simplehash.h instead of dynahash in SMgr

From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
To: David Rowley <dgrowleyml(at)gmail(dot)com>, Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, David Rowley <dgrowley(at)gmail(dot)com>
Subject: Re: Use simplehash.h instead of dynahash in SMgr
Date: 2021-10-06 16:15:38
Message-ID: 0a0d8e9bde49129c643ce76d4e48424d2567c6cc.camel@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Good day, David and all.

В Вт, 05/10/2021 в 11:07 +1300, David Rowley пишет:
> On Mon, 4 Oct 2021 at 20:37, Jaime Casanova
> <jcasanov(at)systemguards(dot)com(dot)ec> wrote:
> > Based on your comments I will mark this patch as withdrawn at midday
> > of
> > my monday unless someone objects to that.
>
> I really think we need a hash table implementation that's faster than
> dynahash and supports stable pointers to elements (simplehash does not
> have stable pointers). I think withdrawing this won't help us move
> towards getting that.

Agree with you. I believe densehash could replace both dynahash and
simplehash. Shared memory usages of dynahash should be reworked to
other less dynamic hash structure. So there should be densehash for
local hashes and statichash for static shared memory.

densehash slight slowness compared to simplehash in some operations
doesn't worth keeping simplehash beside densehash.

> Thomas voiced his concerns here about having an extra hash table
> implementation and then also concerns that I've coded the hash table
> code to be fast to iterate over the hashed items. To be honest, I
> think both Andres and Thomas must be misunderstanding the bitmap part.
> I get the impression that they both think the bitmap is solely there
> to make interations faster, but in reality it's primarily there as a
> compact freelist and can also be used to make iterations over sparsely
> populated tables fast. For the freelist we look for 0-bits, and we
> look for 1-bits during iteration.

I think this part is overengineered. More below.

> I think I'd much rather talk about the concerns here than just
> withdraw this. Even if what I have today just serves as something to
> aid discussion.
>
> It would also be good to get the points Andres raised with me off-list
> on this thread. I think his primary concern was that bitmaps are
> slow, but I don't really think maintaining full pointers into freed
> items is going to improve the performance of this.
>
> David

First on "quirks" in the patch I was able to see:

DH_NEXT_ZEROBIT:

DH_BITMAP_WORD mask = (~(DH_BITMAP_WORD) 0) << DH_BITNUM(prevbit);
DH_BITMAP_WORD word = ~(words[wordnum] & mask); /* flip bits */

really should be

DH_BITMAP_WORD mask = (~(DH_BITMAP_WORD) 0) << DH_BITNUM(prevbit);
DH_BITMAP_WORD word = (~words[wordnum]) & mask; /* flip bits */

But it doesn't harm because DH_NEXT_ZEROBIT is always called with
`prevbit = -1`, which is incremented to `0`. Therefore `mask` is always
`0xffff...ff`.

DH_INDEX_TO_ELEMENT

/* ensure this segment is marked as used */
should be
/* ensure this item is marked as used in the segment */

DH_GET_NEXT_UNUSED_ENTRY

/* find the first segment with an unused item */
while (seg != NULL && seg->nitems == DH_ITEMS_PER_SEGMENT)
seg = tb->segments[++segidx];

No protection for `++segidx <= tb->nsegments` . I understand, it could
not happen due to `grow_threshold` is always lesser than
`nsegment * DH_ITEMS_PER_SEGMENT`. But at least comment should be
leaved about legality of absence of check.

Now architecture notes:

I don't believe there is need for configurable DH_ITEMS_PER_SEGMENT. I
don't event believe it should be not equal to 16 (or 8). Then segment
needs only one `used_items` word, which simplifies code a lot.
There is no much difference in overhead between 1/16 and 1/256.

And then I believe, segment doesn't need both `nitems` and `used_items`.
Condition "segment is full" will be equal to `used_items == 0xffff`.

Next, I think it is better to make real free list instead of looping to
search such one. Ie add `uint32 DH_SEGMENT->next` field and maintain
list starting from `first_free_segment`.
If concern were "allocate from lower-numbered segments first", than min-
heap could be created. It is possible to create very efficient non-
balanced "binary heap" with just two fields (`uint32 left, right`).
Algorithmic PoC in Ruby language is attached.

There is also allocation concern: AllocSet tends to allocate in power2
sizes. Use of power2 segments with header (nitems/used_items) certainly
will lead to wasted 2x space on every segment if element size is also
power2, and a bit lesser for other element sizes.
There could be two workarounds:
- make segment a bit less capable (15 elements instead of 16)
- move header from segment itself to `DH_TYPE->segments` array.
I think, second option is more prefered:
- `DH_TYPE->segments[x]` inevitable accessed on every operation,
therefore why not store some info here?
- if nitems/used_items will be in `DH_TYPE->segments[x]`, then
hashtable iteration doesn't need bitmap at all - there will be no need
in `DH_TYPE->used_segments` bitmap. Absence of this bitmap will
reduce overhead on usual operations (insert/delete) as well.

Hope I was useful.

regards

Yura Sokolov
y(dot)sokolov(at)postgrespro(dot)ru
funny(dot)falcon(at)gmail(dot)com

Attachment Content-Type Size
heap.rb application/x-ruby 1.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2021-10-06 16:20:37 Re: Role Self-Administration
Previous Message Robert Haas 2021-10-06 16:09:36 Re: Running tests under valgrind is getting slower at an alarming pace