Re: RFC: Improve CPU cache locality of syscache searches

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: RFC: Improve CPU cache locality of syscache searches
Date: 2021-08-06 14:48:30
Message-ID: CAFBsxsEtCJkO8vOjfBzfHdsG0TDGjt5CGRE3RG_1+0WOpyw3sg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 5, 2021 at 4:12 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> I have wondered before whether we should have syscache definitions
generate
> code specific to each syscache definition. I do think that'd give a good
bit
> of performance boost. But I don't see a trivial way to get there without
> notational overhead.
>
> We could define syscaches in a header using a macro that's defined
differently
> in syscache.c than everywhere else. The header would declare a set of
> functions for each syscache, syscache.c would define them to call an
> always_inline function with the relevant constants.
>
> Or perhaps we should move syscache definitions into the pg_*.h headers,
and
> generate the relevant code as part of their processing. That seems like it
> could be nice from a modularity POV alone. And it could do better than the
> current approach, because we could hardcode the types for columns in the
> syscache definition without increasing notational overhead.

I had code laying around to generate the info needed for initialization,
but I apparently deleted it and never emailed it. :-(

If we went that far, I wonder if we could specialize the cc_bucket for the
actual types, rather than just number of Datums, which would further save
on space. More complex, though.

Also, if comparisons got cheaper, maybe we could get away with not storing
the full hash in the bucket in favor of a tag of the 16 highest bits. (The
catctup would still need the full hash for invalidation, at least).

Another idea I came across while researching in-memory key-value stores is
"bulk chaining" -- see [1] for an example. In that case, our example
2-cacheline bucket would have a dlist_head pointing not to the catctups,
but to another bucket. So that throws away my L1/L2 analogy and uses a
chain of buckets, each of which contain multiple sets of
hashes/keys/pointers. That's closer to a normal collision chain, but with
better cache locality. If we did that, I imagine the catctup could dispense
with storing its Datum array and its dlist_node entirely.

[1]
https://www.usenix.org/system/files/conference/nsdi14/nsdi14-paper-lim.pdf

--
John Naylor
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-08-06 14:48:40 Re: Alias collision in `refresh materialized view concurrently`
Previous Message tanghy.fnst@fujitsu.com 2021-08-06 14:14:21 [PATCH]Remove obsolete macro CHECKFLOATVAL in btree_gist