Re: RFC: Improve CPU cache locality of syscache searches

From: "Andres Freund" <andres(at)anarazel(dot)de>
To: "Yura Sokolov" <y(dot)sokolov(at)postgrespro(dot)ru>
Cc: "John Naylor" <john(dot)naylor(at)enterprisedb(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: RFC: Improve CPU cache locality of syscache searches
Date: 2021-08-06 06:11:53
Message-ID: c490b87d-6b59-4f44-a3e7-cb660e2940cb@www.fastmail.com
Lists: pgsql-hackers

Hi,

On Thu, Aug 5, 2021, at 22:20, Yura Sokolov wrote:
> Andres Freund wrote 2021-08-06 06:49:
> > Hi,
> >
> > On 2021-08-06 06:43:55 +0300, Yura Sokolov wrote:
> >> Why don't use simplehash or something like that? Open-addressing schemes
> >> show superior cache locality.
> >
> > I thought about that as well - but it doesn't really resolve the question of
> > what we want to store in-line in the hashtable and what not. We can't store
> > the tuples themselves in the hashtable for a myriad of reasons (need pointer
> > stability, they're variably sized, way too large to move around frequently).
> >
> >
> >> Well, simplehash entry will be 24 bytes this way. If simplehash template
> >> supports external key/element storage, then it could be shrunk to 16 bytes,
> >> and syscache entries will not need dlist_node. (But it doesn't at the moment).
> >
> > I think storing keys outside of the hashtable entry defeats the purpose of the
> > open addressing, given that they are always checked and that our conflict
> > ratio should be fairly low.
>
> It's opposite: if conflict ratio were high, then key outside of hashtable will
> be expensive, since lookup to non-matched key will cost excess memory access.
> But with low conflict ratio we will usually hit matched entry at first probe.
> And since we will use entry soon, it doesn't matter when it will go to CPU L1
> cache: during lookup or during actual usage.

Often enough it does matter, because code will depend on whether a lookup is a cache hit or miss earlier than it depends on the content of the cached tuple.
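
To make the two layouts under discussion concrete, here is a minimal sketch of an open-addressing table with the key stored in-line versus stored only in the pointed-to tuple. Everything in it (CachedTuple, InlineEntry, ExternalEntry, hash_key, the lookup and insert functions, TABLE_SIZE) is hypothetical and made up for illustration; it is not simplehash.h or catcache.c code. The tuple_derefs counter only records how often a lookup has to touch tuple memory before it knows whether it hit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8            /* power of two; illustrative only */

/* Hypothetical stand-in for a cached tuple that lives outside the table. */
typedef struct CachedTuple
{
    uint32_t    key;
    int         payload;
} CachedTuple;

/* Layout A: key stored in-line, next to the hash and the pointer. */
typedef struct InlineEntry
{
    uint32_t    hash;
    uint32_t    key;
    CachedTuple *tuple;         /* NULL marks an empty slot */
} InlineEntry;

/* Layout B: key stored only in the pointed-to tuple. */
typedef struct ExternalEntry
{
    uint32_t    hash;
    CachedTuple *tuple;         /* NULL marks an empty slot */
} ExternalEntry;

/* Toy multiplicative hash; an assumption, not the real syscache hash. */
static uint32_t
hash_key(uint32_t key)
{
    return key * 2654435761u;
}

/* Linear-probing insert; silently does nothing if the table is full. */
static void
insert_inline(InlineEntry *table, CachedTuple *t)
{
    uint32_t    hash = hash_key(t->key);

    for (uint32_t i = 0; i < TABLE_SIZE; i++)
    {
        InlineEntry *e = &table[(hash + i) & (TABLE_SIZE - 1)];

        if (e->tuple == NULL)
        {
            e->hash = hash;
            e->key = t->key;
            e->tuple = t;
            return;
        }
    }
}

static void
insert_external(ExternalEntry *table, CachedTuple *t)
{
    uint32_t    hash = hash_key(t->key);

    for (uint32_t i = 0; i < TABLE_SIZE; i++)
    {
        ExternalEntry *e = &table[(hash + i) & (TABLE_SIZE - 1)];

        if (e->tuple == NULL)
        {
            e->hash = hash;
            e->tuple = t;
            return;
        }
    }
}

/* Layout A lookup: hit or miss is decided from table memory alone. */
static CachedTuple *
lookup_inline(const InlineEntry *table, uint32_t key)
{
    uint32_t    hash = hash_key(key);

    for (uint32_t i = 0; i < TABLE_SIZE; i++)
    {
        const InlineEntry *e = &table[(hash + i) & (TABLE_SIZE - 1)];

        if (e->tuple == NULL)
            return NULL;
        if (e->hash == hash && e->key == key)
            return e->tuple;
    }
    return NULL;
}

/* Layout B lookup: confirming a hit requires loading from the tuple. */
static CachedTuple *
lookup_external(const ExternalEntry *table, uint32_t key, int *tuple_derefs)
{
    uint32_t    hash = hash_key(key);

    for (uint32_t i = 0; i < TABLE_SIZE; i++)
    {
        const ExternalEntry *e = &table[(hash + i) & (TABLE_SIZE - 1)];

        if (e->tuple == NULL)
            return NULL;
        if (e->hash == hash)
        {
            (*tuple_derefs)++;  /* touches memory outside the table */
            if (e->tuple->key == key)
                return e->tuple;
        }
    }
    return NULL;
}
```

In this sketch, lookup_inline can return its answer without loading anything from the tuple, while lookup_external has to wait for the tuple's cache line whenever the stored hash matches, before the caller can branch on hit versus miss.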

Regards,

Andres
