Re: tbm_lossify causes "unbalanced" hashtables / bitmaps

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: tbm_lossify causes "unbalanced" hashtables / bitmaps
Date: 2016-09-23 22:14:49
Message-ID: 20160923221449.4fvtxhu654k6n3uh@alap3.anarazel.de
Lists: pgsql-hackers

On 2016-09-23 17:40:13 -0400, Tom Lane wrote:
> My idea of an appropriate fix would be to resume the scan at the same
> point where the last scan stopped, and work circularly around the table
> when necessary.

I've played with that idea, and it does help a great deal. Not that
surprisingly, it's better than starting at a random point (which in turn
is better than starting at one end all the time).

> But I'm not sure there is any really good way to do that
> in the dynahash infrastructure. Maybe it'd work to keep the iteration
> state around, but I don't remember how well that interacts with other
> insertions/deletions. What about in your implementation?

It's easy enough to specify a start point. It requires exposing some
things that I don't necessarily want to expose, which is why I played
around with a random start point, but other than that it's easy to
implement. Growing the hashtable internally would be somewhat of an
issue, since it invalidates that point, but a) most of the time we're no
longer growing by then, and b) it'd be quite harmless to start at the
wrong point.
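
To make the resume-point idea concrete, here is a minimal sketch of a
circular scan that remembers where it stopped. It is not the tidbitmap
or dynahash code: ToyTable, toy_lossify, NSLOTS and resume_at are all
hypothetical names, and the "table" is just a fixed array of flags
standing in for exact page entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 8				/* hypothetical fixed bucket count */

/* Hypothetical stand-in for the bitmap's hashtable: one flag per bucket. */
typedef struct ToyTable
{
	bool		exact[NSLOTS];	/* true = exact entry that can be lossified */
	size_t		resume_at;		/* bucket where the next pass starts */
} ToyTable;

/*
 * Convert up to 'want' exact entries to lossy ones, starting at the saved
 * resume point and wrapping around the table at most once.  Returns the
 * number of entries converted and records where the scan stopped.
 */
size_t
toy_lossify(ToyTable *t, size_t want)
{
	size_t		converted = 0;
	size_t		visited;
	size_t		pos;

	assert(t != NULL);

	/* A stale resume point (say, after a resize) is harmless: wrap it. */
	pos = t->resume_at % NSLOTS;

	for (visited = 0; visited < NSLOTS && converted < want; visited++)
	{
		if (t->exact[pos])
		{
			t->exact[pos] = false;	/* pretend this entry became lossy */
			converted++;
		}
		pos = (pos + 1) % NSLOTS;	/* work circularly around the table */
	}

	t->resume_at = pos;			/* the next call continues from here */
	return converted;
}
```

Successive calls then spread the conversions around the whole table
instead of repeatedly hitting the buckets at one end, and a resume point
that has become meaningless only changes where the next pass begins.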

> There's also the point mentioned in the existing comment, that it'd be
> better to go after pages with more bits set first. Not sure of an
> inexpensive way to do that (ie, one that doesn't involve multiple
> scans of the hashtable). But your results suggest that maybe it'd
> be worth making tbm_lossify slower in order to get better results.

It's not easy, I agree.

Greetings,

Andres Freund
