RE: Protect syscache from bloating with negative cache entries

From: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
To: 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'Kyotaro HORIGUCHI' <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "alvherre(at)alvh(dot)no-ip(dot)org" <alvherre(at)alvh(dot)no-ip(dot)org>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "michael(dot)paquier(at)gmail(dot)com" <michael(dot)paquier(at)gmail(dot)com>, "david(at)pgmasters(dot)net" <david(at)pgmasters(dot)net>, "Jim(dot)Nasby(at)bluetreble(dot)com" <Jim(dot)Nasby(at)bluetreble(dot)com>, "craig(at)2ndquadrant(dot)com" <craig(at)2ndquadrant(dot)com>
Subject: RE: Protect syscache from bloating with negative cache entries
Date: 2019-01-15 02:52:38
Message-ID: 0A3221C70F24FB45833433255569204D1FB67733@G01JPEXMBYT05
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> I'm really disappointed by the direction this thread is going in.
> The latest patches add an enormous amount of mechanism, and user-visible
> complexity, to do something that we learned was a bad idea decades ago.
> Putting a limit on the size of the syscaches doesn't accomplish anything
> except to add cycles if your cache working set is below the limit, or make
> performance fall off a cliff if it's above the limit. I don't think there's
> any reason to believe that making it more complicated will avoid that
> problem.
> What does seem promising is something similar to Horiguchi-san's original
> patches all the way back at
> hi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp

> so I'd been thinking about ways to fix that case in particular.

You're suggesting to go back to the original issue (bloat by negative cache entries) and give simpler solution to it once, aren't you? That may be the way to go.

But the syscache/relcache bloat still remains a problem, when there are many live tables and application connections. Would you agree to solve this in some way? I thought Horiguchi-san's latest patches would solve this and the negative entries. Can we consider that his patch and yours are orthogonal, i.e., we can pursue Horiguchi-san's patch after yours is committed?

(As you said, some parts of Horiguchi-san's patches may be made simpler. For example, the ability to change another session's GUC variable can be discussed in a separate thread.)

I think we need some limit to the size of the relcache, syscache, and plancache. Oracle and MySQL both have it, using LRU to evict less frequently used entries. You seem to be concerned about the LRU management based on your experience, but would it really cost so much as long as each postgres process can change the LRU list without coordination with other backends now? Could you share your experience?

FYI, Oracle provides one parameter, shared_pool_size, that determine the size of a memory area that contains SQL plans and various dictionary objects. Oracle decides how to divide the area among constituents. So it could be possible that one component (e.g. table/index metadata) is short of space, and another (e.g. SQL plans) has free space. Oracle provides a system view to see the free space and hit/miss of each component. If one component suffers from memory shortage, the user increases shared_pool_size. This is similar to what Horiguchi-san is proposing.

MySQL enables fine-tuning of each component. It provides the size parameters for six memory partitions of the dictionary object cache, and the usage statistics of those partitions through the Performance Schema.

tablespace definition cache
schema definition cache
table definition cache
stored program definition cache
character set definition cache
collation definition cache

I wonder whether we can group existing relcache/syscache entries like this.

14.4 Dictionary Object Cache
The dictionary object cache is a shared global cache that stores previously accessed data dictionary objects in memory to enable object reuse and minimize disk I/O. Similar to other cache mechanisms used by MySQL, the dictionary object cache uses an LRU-based eviction strategy to evict least recently used objects from memory.

The dictionary object cache comprises cache partitions that store different object types. Some cache partition size limits are configurable, whereas others are hardcoded.
-------------------------------------------------- How MySQL Uses Memory
MySQL requires memory and descriptors for the table cache.

For InnoDB, table_definition_cache acts as a soft limit for the number of open table instances in the InnoDB data dictionary cache. If the number of open table instances exceeds the table_definition_cache setting, the LRU mechanism begins to mark table instances for eviction and eventually removes them from the data dictionary cache. The limit helps address situations in which significant amounts of memory would be used to cache rarely used table instances until the next server restart.

Takayuki Tsunakawa

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-01-15 02:53:17 Re: Libpq support to connect to standby server as priority
Previous Message Michael Paquier 2019-01-15 02:48:55 Re: Libpq support to connect to standby server as priority