RE: Global shared meta cache

From: "ideriha(dot)takeshi(at)fujitsu(dot)com" <ideriha(dot)takeshi(at)fujitsu(dot)com>
To: "ideriha(dot)takeshi(at)fujitsu(dot)com" <ideriha(dot)takeshi(at)fujitsu(dot)com>, 'Konstantin Knizhnik' <k(dot)knizhnik(at)postgrespro(dot)ru>, 'Amit Langote' <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, 'Thomas Munro' <thomas(dot)munro(at)gmail(dot)com>
Subject: RE: Global shared meta cache
Date: 2019-10-09 06:06:45
Message-ID: OSAPR01MB19852E61F6B1B17F8794D26FEA950@OSAPR01MB1985.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Hi, Konstantin

>>From: Konstantin Knizhnik [mailto:k(dot)knizhnik(at)postgrespro(dot)ru]
>>I do not completely understand from your description when you are going
>>to evict an entry from the local cache.
>>Just once transaction is committed? I think it will be more efficient
>>to also specify memory threshold for local cache size and use LRU or
>>some other eviction policy to remove data from local cache.
>>So if working set (accessed relations) fits in local cache limit, there
>>will be no performance penalty comparing with current implementation.
>>There should be completely no difference on pgbench or other benchmarks
>>with relatively small number of relations.
>>
>>If entry is not found in local cache, then we should look for it in
>>global cache and in case of double cache miss - read it from the disk.
>>I do not completely understand why we need to store references to
>>global cache entries in local cache and use reference counters for
>>global cache entries.
>>Why we can not maintain just two independent caches?
>>
>>While there are really databases with hundreds and even thousands of
>>tables, application is still used to work with only some small subset of them.
>>So I think that "working set" can still fit in memory. This is why I
>>think that in case of local cache miss and global cache hit, we should
>>copy data from global cache to local cache to make it possible to access
>>it in future without any synchronization.
>>
>>As far as we need to keep all uncommitted data in local cache, there is
>>still a chance of local memory overflow (if some transaction creates or
>>alters too many tables).
>>But I think that it is very exotic and rare use case. The problem with
>>memory overflow usually takes place if we have large number of
>>backends, each maintaining its own catalog cache.
>>So I think that we should have "soft" limit for local cache and "hard"
>>limit for global cache.
>
>Oh, I didn't come up with this idea at all. So the local cache is a sort of first-level
>cache and the global cache is the second level. That sounds great.
>It would be good for performance, and setting two GUC parameters to limit the local
>cache and the global cache gives the DBA complete memory control.
>Yeah, uncommitted data should stay in the local cache, but that's the only exception.
>Not having to keep track of references to the global cache from the local cache header
>seems less complex to implement. I'll look into the design.

(After sleeping on it)
What happens if there is a cache miss in the local cache and the entry is found
in the global cache?
One possible way is to copy the found global cache entry into local memory. If so,
I'm just anxious about the cost of memcpy. Another way is, for example,
to leave the entry in the global cache and not copy it into local memory. In this case,
searching the global cache every time seems expensive because we need to
take a lock on at least the partition of the hash table.
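
To make the first option concrete, here is a minimal sketch of a lookup that
copies on a global hit. It is not taken from the patch: every name in it
(CacheEntry, LocalCache, GlobalCache, cache_lookup, the slot counts) is
hypothetical, the global cache is a toy direct-mapped array, and a C11
atomic_flag spinlock stands in for a hash partition lock. The probes counter
exists only to show how often the locked path is taken.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define LOCAL_SLOTS  4    /* hypothetical "soft" limit of the local cache */
#define GLOBAL_SLOTS 64   /* hypothetical "hard" limit of the global cache */
#define N_PARTITIONS 8    /* hypothetical number of lock partitions */

typedef struct CacheEntry
{
    unsigned key;           /* stand-in for a catalog cache key */
    bool     valid;
    char     payload[32];   /* stand-in for the cached tuple */
} CacheEntry;

typedef struct LocalCache
{
    CacheEntry    slots[LOCAL_SLOTS];
    unsigned long last_used[LOCAL_SLOTS];   /* 0 means never used */
    unsigned long clock;
} LocalCache;

typedef struct GlobalCache
{
    CacheEntry  slots[GLOBAL_SLOTS];
    atomic_flag lock[N_PARTITIONS];
    atomic_uint probes;     /* locked searches so far, illustration only */
} GlobalCache;

static void
global_init(GlobalCache *g)
{
    memset(g->slots, 0, sizeof(g->slots));
    for (int i = 0; i < N_PARTITIONS; i++)
        atomic_flag_clear(&g->lock[i]);
    atomic_store(&g->probes, 0);
}

static void
partition_lock(GlobalCache *g, unsigned slot)
{
    while (atomic_flag_test_and_set(&g->lock[slot % N_PARTITIONS]))
        ;                   /* spin until the partition is free */
}

static void
partition_unlock(GlobalCache *g, unsigned slot)
{
    atomic_flag_clear(&g->lock[slot % N_PARTITIONS]);
}

static void
global_put(GlobalCache *g, unsigned key, const char *payload)
{
    unsigned slot = key % GLOBAL_SLOTS;

    partition_lock(g, slot);
    g->slots[slot].key = key;
    g->slots[slot].valid = true;
    snprintf(g->slots[slot].payload, sizeof(g->slots[slot].payload), "%s", payload);
    partition_unlock(g, slot);
}

/* Returns the local copy, or NULL on a double miss (caller would read from disk). */
static const CacheEntry *
cache_lookup(LocalCache *l, GlobalCache *g, unsigned key)
{
    int      victim = 0;
    unsigned slot = key % GLOBAL_SLOTS;
    bool     found;

    /* 1. Local search: no lock needed. */
    for (int i = 0; i < LOCAL_SLOTS; i++)
    {
        if (l->slots[i].valid && l->slots[i].key == key)
        {
            l->last_used[i] = ++l->clock;
            return &l->slots[i];
        }
        if (l->last_used[i] < l->last_used[victim])
            victim = i;     /* remember the least recently used slot */
    }

    /* 2. Global search under the partition lock. */
    partition_lock(g, slot);
    atomic_fetch_add(&g->probes, 1);
    found = g->slots[slot].valid && g->slots[slot].key == key;
    if (found)
        l->slots[victim] = g->slots[slot];  /* the copy whose cost is in question */
    partition_unlock(g, slot);

    if (!found)
        return NULL;
    l->last_used[victim] = ++l->clock;
    return &l->slots[victim];
}
```

With a zero-initialized LocalCache, the first cache_lookup() for a key takes
the partition lock once and pays for one copy; later lookups of the same key
are served from the local slot without touching the lock until LRU evicts it.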

The architecture in which the local cache holds a reference to the global cache entry
(strictly speaking, a pointer to a pointer to the global cache entry) is complex,
but once a process has searched the global cache, it can afterwards reach the entry by
checking that the reference is still valid and traversing a few pointers.
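
For comparison, a rough sketch of the reference approach. How validity is
checked is not spelled out above, so the generation counter here is purely an
assumption, and GlobalHandle, LocalRef, ref_take, ref_deref, handle_install
and handle_evict are hypothetical names. The sketch is single-threaded; a real
version would also need reference counts or memory barriers so that an entry
cannot be freed while a backend is reading it.

```c
#include <assert.h>
#include <stddef.h>

typedef struct GlobalEntry
{
    unsigned key;
    int      value;             /* stand-in for the cached tuple */
} GlobalEntry;

/* Hypothetical indirection slot that would live in shared memory. */
typedef struct GlobalHandle
{
    GlobalEntry *entry;         /* NULL once the entry is evicted */
    unsigned     generation;    /* bumped on every install or eviction */
} GlobalHandle;

/* What the local cache keeps instead of a copy of the entry. */
typedef struct LocalRef
{
    GlobalHandle *handle;       /* pointer to the pointer to the global entry */
    unsigned      generation;   /* generation seen when the reference was taken */
} LocalRef;

static void
handle_install(GlobalHandle *h, GlobalEntry *e)
{
    h->entry = e;
    h->generation++;
}

static void
handle_evict(GlobalHandle *h)
{
    h->entry = NULL;
    h->generation++;
}

/* Taken once, right after a successful (locked) global cache search. */
static LocalRef
ref_take(GlobalHandle *h)
{
    LocalRef r = {h, h->generation};

    return r;
}

/* Later accesses: validate, then traverse two pointers. No hash search. */
static GlobalEntry *
ref_deref(const LocalRef *r)
{
    if (r->handle->entry == NULL || r->handle->generation != r->generation)
        return NULL;            /* stale: search the global cache again */
    return r->handle->entry;
}
```

A NULL result from ref_deref() means the entry was evicted or replaced since
the reference was taken, so the backend falls back to a full global search.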

Regards,
Takeshi Ideriha
