Re: Global shared meta cache

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>, 'Amit Langote' <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, 'Thomas Munro' <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Global shared meta cache
Date: 2019-08-01 10:00:39
Message-ID: 48ba23a7-6104-c661-4b49-d0de05476679@postgrespro.ru
Lists: pgsql-hackers

Takeshi-san,

I am sorry for the late response - I was waiting for a new version of the
patch from you to review.
I read your last proposal and it seems very reasonable.
From my point of view, we cannot reach an acceptable level of performance
if we do not have a local cache at all.
So, as you proposed, we should maintain a local cache for uncommitted data.

I think that the size of the global cache should be limited (you have
introduced a GUC for it).
In principle it is possible to use dynamic shared memory and have an
unlimited global cache.
But I do not see much sense in it.

I do not completely understand from your description when you are going
to evict an entry from the local cache.
Only once the transaction is committed? I think it will be more efficient to
also specify a memory threshold for the local cache size
and use LRU or some other eviction policy to remove data from the local cache.
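
As a rough illustration of a size-limited local cache with LRU eviction,
here is a minimal sketch. All names (LocalCache, local_cache_get,
local_cache_put, LOCAL_CACHE_LIMIT) are hypothetical and do not come from
the patch; the limit is counted in entries rather than bytes to keep the
example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOCAL_CACHE_LIMIT 3     /* hypothetical limit, counted in entries */

typedef struct LocalEntry
{
    int           key;          /* stands in for a catalog key such as an OID */
    int           value;        /* stands in for the cached tuple */
    unsigned long last_used;    /* logical clock value of the last access */
    int           in_use;       /* nonzero if the slot holds an entry */
} LocalEntry;

typedef struct LocalCache
{
    LocalEntry    slots[LOCAL_CACHE_LIMIT];
    unsigned long clock;        /* incremented on every access */
} LocalCache;

void
local_cache_init(LocalCache *c)
{
    memset(c, 0, sizeof(*c));
}

/* Return a pointer to the cached value, or NULL on a miss. */
int *
local_cache_get(LocalCache *c, int key)
{
    assert(c != NULL);
    for (int i = 0; i < LOCAL_CACHE_LIMIT; i++)
    {
        LocalEntry *e = &c->slots[i];

        if (e->in_use && e->key == key)
        {
            e->last_used = ++c->clock;  /* a hit refreshes recency */
            return &e->value;
        }
    }
    return NULL;
}

/*
 * Insert or update an entry. When every slot is taken, the least recently
 * used one is overwritten. A real implementation would have to skip
 * entries built from uncommitted data, which must stay local.
 */
void
local_cache_put(LocalCache *c, int key, int value)
{
    int        *existing = local_cache_get(c, key);
    LocalEntry *victim = &c->slots[0];

    if (existing != NULL)
    {
        *existing = value;
        return;
    }
    for (int i = 0; i < LOCAL_CACHE_LIMIT; i++)
    {
        LocalEntry *e = &c->slots[i];

        if (!e->in_use)
        {
            victim = e;         /* a free slot always wins */
            break;
        }
        if (e->last_used < victim->last_used)
            victim = e;
    }
    victim->key = key;
    victim->value = value;
    victim->in_use = 1;
    victim->last_used = ++c->clock;
}
```

With a limit of three entries, inserting a fourth key overwrites whichever
entry was touched least recently.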

So if the working set (accessed relations) fits within the local cache limit,
there will be no performance penalty compared with the current implementation.
There should be no difference at all on pgbench or other benchmarks
with a relatively small number of relations.

If an entry is not found in the local cache, then we should look for it in
the global cache and, in case of a double cache miss, read it from disk.
I do not completely understand why we need to store references to global
cache entries in the local cache and use reference counters for global cache
entries.
Why can we not just maintain two independent caches?

While there really are databases with hundreds and even thousands of
tables, an application usually works with only a small subset of
them.
So I think that the "working set" can still fit in memory. This is why I
think that in case of a local cache miss and a global cache hit, we should
copy the data from the global cache to the local cache
to make it possible to access it in the future without any synchronization.
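
The lookup order with two independent caches, including the copy into the
local cache on a global hit, could look roughly like the sketch below.
This is only an illustration under simplifying assumptions: the Cache
type, cache_lookup and catalog_read are hypothetical names, keys are small
integers, and the locking that a real global cache in shared memory would
need is omitted.

```c
#include <assert.h>
#include <stdbool.h>

#define NKEYS 8                 /* toy key space */

typedef enum
{
    FROM_LOCAL,
    FROM_GLOBAL,
    FROM_CATALOG
} LookupSource;

typedef struct Cache
{
    bool valid[NKEYS];
    int  value[NKEYS];
} Cache;

/* Stand-in for reading the catalog from shared buffers or disk. */
int
catalog_read(int key)
{
    return key * 100;
}

/*
 * Look in the local cache first, then in the global cache, and only on a
 * double miss read the catalog. Whatever is found is copied into the local
 * cache so that later accesses need no synchronization.
 */
int
cache_lookup(Cache *local, Cache *global, int key, LookupSource *src)
{
    assert(key >= 0 && key < NKEYS);

    if (local->valid[key])
    {
        *src = FROM_LOCAL;
        return local->value[key];
    }

    /* A real implementation would take a lock on the global cache here. */
    if (global->valid[key])
        *src = FROM_GLOBAL;
    else
    {
        global->value[key] = catalog_read(key);
        global->valid[key] = true;
        *src = FROM_CATALOG;
    }

    /* Copy, not reference: the two caches stay independent. */
    local->value[key] = global->value[key];
    local->valid[key] = true;
    return local->value[key];
}
```

Because the local cache holds a copy rather than a pointer, no reference
counter on the global entry is needed in this sketch.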

Since we need to keep all uncommitted data in the local cache, there is
still a chance of local memory overflow (if some transaction creates or
alters too many tables).
But I think that this is a very exotic and rare use case. The problem with
memory overflow usually takes place if we have a large number of backends,
each maintaining its own catalog cache.
So I think that we should have a "soft" limit for the local cache and a "hard"
limit for the global cache.

I didn't think much about cache invalidation. I read your proposal, but
frankly speaking I do not understand why it should be so complicated.
Why can't we immediately invalidate the entry in the global cache and lazily (as
it is done now using invalidation signals) invalidate the local caches?
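
To make the question concrete, here is a minimal sketch of "immediate
global, lazy local" invalidation. It is a toy model, not the existing
sinval code: GlobalState, Backend, invalidate and accept_invalidations are
hypothetical names, the queue is a fixed array that never wraps, and
locking is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define NKEYS     8             /* toy key space */
#define QUEUE_LEN 16            /* toy queue size, no wraparound */

typedef struct GlobalState
{
    bool valid[NKEYS];          /* validity of global cache entries */
    int  queue[QUEUE_LEN];      /* pending invalidation messages (keys) */
    int  nqueued;               /* number of messages written so far */
} GlobalState;

typedef struct Backend
{
    bool valid[NKEYS];          /* validity of this backend's local entries */
    int  nread;                 /* number of messages already processed */
} Backend;

/* Called by the process that changed the catalog. */
void
invalidate(GlobalState *g, int key)
{
    assert(key >= 0 && key < NKEYS);
    assert(g->nqueued < QUEUE_LEN);

    g->valid[key] = false;          /* global entry is dropped at once */
    g->queue[g->nqueued++] = key;   /* local caches will catch up later */
}

/* Called by each backend at its own timing, e.g. before a lookup. */
void
accept_invalidations(const GlobalState *g, Backend *b)
{
    while (b->nread < g->nqueued)
        b->valid[g->queue[b->nread++]] = false;
}
```

Until a backend calls accept_invalidations, it keeps seeing its old local
entry, which mirrors how local caches lag behind the queue today.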

On 26.06.2019 9:23, Ideriha, Takeshi wrote:
> Hi, everyone.
>
>> From: Ideriha, Takeshi [mailto:ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com]
>> My current thoughts:
>> - Each catcache has (maybe partial) HeapTupleHeader
>> - put every catcache on shared memory and no local catcache
>> - but catcache for aborted tuple is not put on shared memory
>> - Hash table exists per kind of CatCache
>> - These hash tables exist for each database and shared
>> - e.g) there is a hash table for pg_class of a DB
> I talked about shared CatCache (SysCache) with Thomas at PGCon and he
> suggested using sinval to control cache visibility instead of xid.
> Based on this I've changed my design. I'll send some PoC patch in a week
> but share my idea beforehand. I'm sorry this email is too long to read
> but I'm happy if you have some comments.
>
> Basically I won't make shared catcache as default, make it as option.
>
> Both local and shared memory has hash tables of catcache. A shared hash
> entry is catctup itself and a local hash entry is a pointer to the
> shared catctup. Actually, local hash entry does not hold a direct pointer
> but points to a handle of shared catctup. The handle points to shared
> catctup and is located in shared memory. This is intended to avoid
> dangling pointer of local hash entry due to eviction of shared catctup
> by LRU. ( The detail about LRU will be written in another email because
> I'll implement it later.)
>
> * Search and Insert
> Current postgres searches (local) hash table and if it's missed, search
> the actual catalog (shared buffer and disk) and build the cache; build
> the negative cache if not found.
>
> In new architecture, if cache is not found in local hash table, postgres
> tries to search shared one before consulting shared buffer. Here is a
> detail. To begin with, postgres looks up the pointer in local hash
> table. If it's found, it references the pointer and gets catctup. If
> not, it searches the shared hash table and gets catctup and insert
> its pointer into local hash table if the catctup is found. If it doesn't
> exist in shared hash table either, postgres searches actual catalog and
> build the cache and in most cases insert it into shared hash table
> and its pointer to local one. The exception case is that the cache
> is made from uncommitted catalog tuple, which must not be seen from
> other process. So an uncommitted cache is built in local memory and
> pushed directly into local table but not shared one. Lastly, if there
> is no tuple we're looking for, put negative tuple into shared hash table.
>
> * Invalidation and visibility control
> Now let's talk about invalidation. Current cache invalidation is based
> on local and shared invalidation queue (sinval). When transaction is
> committed, sinval msg is queued into shared one. Other processes read and
> process sinval msgs at their own timing.
>
> In shared catcache, I follow the current sinval in most parts. But I'll
> change the action when sinval msg is queued up and read by a process.
> When messages are added to shared queue, identify corresponding shared
> caches (matched by hash value) and turn their "obsolete flag" on. When
> sinval msg is read by a process, each process deletes the local hash
> entries (pointer to handler). Each process can see a shared catctup as
> long as its pointer (local entry) is valid. Because sinval msgs are not
> processed yet, it's ok to keep seeing the pointer to possibly old
> cache. After local entry is invalidated, its local process tries
> to search shared hash table to always find a catctup whose obsolete flag
> is off. The process can see the right shared cache after invalidation
> messages are read because it checks the obsolete flag and also
> uncommitted cache never exists in shared memory at all.
>
> There is a subtle thing here. Always finding a shared catctup without
> obsolete mark assumes that the process already read the sinval msgs. So
> before trying to search shared table, I make the process read sinval msg.
> After it's read, local cache status becomes consistent with the action
> to get a new cache. This reading timing is almost same as current postgres
> behavior because it's happened after local cache miss both in current
> design and mine. After cache miss in current design, a process opens
> the relation and gets a heavyweight lock. At this time, in fact, it reads
> the sinval msgs. (These things are well summarized in the talk by Robert
> Haas at PGCon[1]).
>
> Lastly, we need to invalidate a shared catctup itself at some point. But
> we cannot delete it as long as someone sees it. So I'll introduce
> refcounter. It's increased or decreased at the same timing when
> current postgres manipulates the local refcounter of catctup and catclist
> to avoid catctup is deleted while catclist is used or vice versa (that
> is SearchCatCache/ReleaseCatCache). So shared catctup is deleted when
> its shared refcount becomes zero and obsolete flag is on. Once it's
> vanished from shared cache, the obsolete cache never comes back again
> because a process which tries to get cache but fails in shared hash table
> already read the sinval messages (in any case it reads them when opening
> a table and taking a lock).
>
>
> I'll make a PoC aside from performance issue at first and use
> SharedMemoryContext (ShmContext) [2], which I'm making to allocate/free
> shared items via palloc/pfree.
>
> [1] https://www.pgcon.org/2019/schedule/attachments/548_Challenges%20of%20Concurrent%20DDL.pdf
> [2] https://commitfest.postgresql.org/23/2166/
>
> ---
> Regards,
> Takeshi Ideriha
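
For reference, the deletion rule in the quoted proposal (a shared entry
goes away only when its refcount is zero and its obsolete flag is on) can
be sketched as below. The SharedEntry type and the function names are
hypothetical, the "freed" field merely stands in for releasing shared
memory, and the atomic operations or locks a real shared structure would
need are omitted.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct SharedEntry
{
    int  refcount;              /* number of users currently holding the entry */
    bool obsolete;              /* set when an invalidation targets the entry */
    bool freed;                 /* stands in for releasing shared memory */
} SharedEntry;

/* Free the entry only when nobody uses it and it has been invalidated. */
static void
maybe_free(SharedEntry *e)
{
    if (e->refcount == 0 && e->obsolete)
        e->freed = true;
}

void
entry_pin(SharedEntry *e)
{
    assert(!e->freed);
    e->refcount++;
}

void
entry_unpin(SharedEntry *e)
{
    assert(e->refcount > 0);
    e->refcount--;
    maybe_free(e);
}

void
entry_mark_obsolete(SharedEntry *e)
{
    e->obsolete = true;
    maybe_free(e);
}
```

An entry that is marked obsolete while still pinned survives until the
last unpin.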

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
