RE: Global shared meta cache

From: "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>
To: 'Amit Langote' <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Global shared meta cache
Date: 2018-11-26 12:12:09
Message-ID: 4E72940DA2BF16479384A86D54D0988A6F3B92BC@G01JPEXMBKW04
Lists: pgsql-hackers

Hi,

>From: Ideriha, Takeshi [mailto:ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com]
>Sent: Wednesday, October 3, 2018 3:18 PM
>At this moment this patch only allocates catalog cache header and CatCache data on
>the shared memory area.
I'm trying to handle this allocation work in a broader way in another thread [1].

>The followings are major items I haven't touched:
>- how to treat cache visibility and invalidation coming from transactions including DDL

Some of you commented on this point before, but at that time I had less knowledge
and couldn't reply immediately. Sorry for that.

Right now I have come up with two plans.
Plan A does all of the work in shared memory and uses no local cache.
Plan B uses both a shared cache and a local cache.
Based on the discussion several months ago in this thread, plan B would probably be better.
But there are some variations of plan B, so I'd like to hear opinions.

A. Use only shared memory
Because everything has to be done inside shared memory, it needs the same machinery as the current shared_buffers.
That is, handling transactions that include DDL properly needs MVCC, and cleaning up obsolete cache entries needs vacuum.
Taking advantage of MVCC and vacuum would work, but it seems to me pretty tough to implement them.
So another option is plan B, which handles version control of the cache and cleans it up in a different way.

B. Use both shared memory and local memory
The basic policy is that shared memory keeps the latest version of each cache entry as much as possible, and each cache entry has version information (xmin, xmax).
The local cache is a kind of cache of the shared one, and its lifetime is temporary.

[Search cache]
When a backend wants to use a relation or catalog cache entry in a transaction, it tries to find it in the following order:
1. local cache
2. shared cache
3. disk

At first there is no local cache, so the backend searches the shared cache and, if the entry is found, loads it into local memory.
If the wanted entry is not found in shared memory, the backend fetches it from disk.
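
As a rough illustration of the search order above, here is a minimal Python sketch. The class name TieredCatalogCache, the load_from_disk callback, and the dict-based caches are all hypothetical stand-ins, not anything in the patch. Whether a disk fetch should also populate the shared cache is not stated above; the sketch assumes it does.

```python
from typing import Callable, Dict, Hashable, Tuple


class TieredCatalogCache:
    """Hypothetical model of the local -> shared -> disk search order."""

    def __init__(self, shared: Dict[Hashable, object],
                 load_from_disk: Callable[[Hashable], object]) -> None:
        self.local: Dict[Hashable, object] = {}  # per-backend cache, empty at first
        self.shared = shared                      # stands in for the shared-memory cache
        self.load_from_disk = load_from_disk      # stands in for reading the catalog

    def lookup(self, key: Hashable) -> Tuple[object, str]:
        """Return the entry and the tier that satisfied the lookup."""
        # 1. local cache
        if key in self.local:
            return self.local[key], "local"
        # 2. shared cache: on a hit, load the entry into local memory
        if key in self.shared:
            self.local[key] = self.shared[key]
            return self.local[key], "shared"
        # 3. disk
        entry = self.load_from_disk(key)
        self.shared[key] = entry  # assumption: a miss also fills the shared cache
        self.local[key] = entry
        return entry, "disk"
```

The first lookup of a key is served from shared memory or disk, and later lookups in the same backend are served from the local copy.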

[Lifetime of local cache]
When ALTER TABLE/DROP TABLE is issued in a transaction, the relevant local cache should differ from the original one.
On this point I'm thinking of two cases.
B-1: Create a local cache at the first reference and keep it until the transaction ends.
The relevant local cache is updated or deleted when the DROP/ALTER is issued. It's freed when the transaction is committed or aborted.
B-2: The lifetime of the local cache is one snapshot. If the isolation level is read committed, the local cache is deleted every time a command is issued.

In the case of B-1, the sinval message machinery is necessary to update the local cache, which is the same as the current machinery.
On the other hand, B-2 doesn't need sinval messages, because the local cache is deleted once the snapshot's lifetime ends.
From another point of view, there is a trade-off between B-1 and B-2: B-1 would outperform B-2,
but B-2 would use less memory.
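
To make the difference between B-1 and B-2 concrete, here is a small Python sketch of the two lifetimes. The names Policy, LocalCache, and the three hook methods are hypothetical; the hooks only mark the points where a real backend would react to an invalidation message, the end of a snapshot, or the end of a transaction.

```python
from enum import Enum
from typing import Dict, Hashable


class Policy(Enum):
    B1 = "keep until transaction end"
    B2 = "keep for one snapshot"


class LocalCache:
    """Hypothetical per-backend cache whose lifetime follows B-1 or B-2."""

    def __init__(self, policy: Policy) -> None:
        self.policy = policy
        self.entries: Dict[Hashable, object] = {}

    def on_invalidation(self, key: Hashable) -> None:
        # B-1 keeps entries across snapshots, so it has to react to
        # invalidation messages. B-2 ignores them: its entries never
        # outlive the snapshot that loaded them.
        if self.policy is Policy.B1:
            self.entries.pop(key, None)

    def on_snapshot_end(self) -> None:
        # B-2 drops everything when the snapshot ends (every command
        # under read committed). B-1 keeps its entries.
        if self.policy is Policy.B2:
            self.entries.clear()

    def on_transaction_end(self) -> None:
        # Commit or abort frees the local cache under both policies.
        self.entries.clear()
```

Under B-1 an entry survives several commands and is reused, which is where the performance advantage would come from; under B-2 it is reloaded after each snapshot, which keeps the local memory footprint small.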

[Invalidation of shared cache]
I'm thinking that invalidating the shared cache can be the responsibility of a backend that wants to see the latest version, rather than
of the one that committed the DROP/ALTER command. In my sketch each cache entry has its own version information, so a transaction can compare its snapshot
with the shared cache version and, if the cached entry is not the wanted one, obtain it from disk.
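
The version comparison could look roughly like the following Python sketch. It is only an illustration under strong assumptions: a snapshot is reduced to a single xid, which is much simpler than a real PostgreSQL snapshot, and VersionedEntry, is_visible, read_shared, and the load_from_disk callback are hypothetical names. The rule for when the reader replaces the shared entry is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class VersionedEntry:
    payload: str
    xmin: int                   # xid that created this version
    xmax: Optional[int] = None  # xid that obsoleted it, None while current


def is_visible(entry: VersionedEntry, snapshot_xid: int) -> bool:
    """Simplified visibility: created at or before the snapshot and not yet obsoleted."""
    created = entry.xmin <= snapshot_xid
    not_obsoleted = entry.xmax is None or entry.xmax > snapshot_xid
    return created and not_obsoleted


def read_shared(shared: Dict[str, VersionedEntry], key: str, snapshot_xid: int,
                load_from_disk: Callable[[str, int], VersionedEntry]) -> VersionedEntry:
    """Return the shared entry if it matches the snapshot, else fetch from disk."""
    entry = shared.get(key)
    if entry is not None and is_visible(entry, snapshot_xid):
        return entry
    fresh = load_from_disk(key, snapshot_xid)
    # Assumption: the reader replaces the shared entry only when it is
    # missing or already obsolete for this reader, so that shared memory
    # keeps moving toward the latest version.
    if entry is None or (entry.xmax is not None and entry.xmax <= snapshot_xid):
        shared[key] = fresh
    return fresh
```

Here the backend that needs the newer version is the one that replaces the stale shared entry, so the committer of the DROP/ALTER does not have to touch the shared cache itself.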

Do you have any thoughts?

[1] https://www.postgresql.org/message-id/flat/4E72940DA2BF16479384A86D54D0988A6F1EE452%40G01JPEXMBKW04

Regards,
Takeshi Ideriha
