Re: Protect syscache from bloating with negative cache entries

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com, alvherre(at)alvh(dot)no-ip(dot)org, andres(at)anarazel(dot)de, robertmhaas(at)gmail(dot)com, michael(dot)paquier(at)gmail(dot)com, david(at)pgmasters(dot)net, Jim(dot)Nasby(at)bluetreble(dot)com, craig(at)2ndquadrant(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us
Subject: Re: Protect syscache from bloating with negative cache entries
Date: 2018-09-13 12:40:59
Lists: pgsql-hackers

Hello. Thank you for looking at this.

At Wed, 12 Sep 2018 05:16:52 +0000, "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com> wrote in <4E72940DA2BF16479384A86D54D0988A6F197012(at)G01JPEXMBKW04>
> Hi,
> >Subject: Re: Protect syscache from bloating with negative cache entries
> >
> >Hello. The previous v4 patchset was just broken.
> >Somehow the 0004 was merged into the 0003 and applying 0004 results in failure. I
> >removed 0004 part from the 0003 and rebased and repost it.
> I have some questions about syscache and relcache pruning,
> though they may have been discussed upthread or be beside the point.
> Can I confirm something about catcache pruning?
> syscache_memory_target is the max figure per CatCache.
> (Any CatCache has the same max value.)
> So the total max size of catalog caches is estimated around or
> slightly more than # of SysCache array times syscache_memory_target.


> If correct, I think writing down the above estimate in the documentation
> would help DB administrators estimate memory usage.
> My impression is that the current description might mislead readers into
> thinking that syscache_memory_target is the total size of the catalog cache.

Honestly, I'm not sure that is the right design. However, I don't
think providing such a formula helps users, since they don't know
exactly how many CatCaches and their siblings live in their
server, it is only a soft limit, and in the end only a few
catalogs, or just one, can reach the limit.

The current design is based on the assumption that only one cache
grows extremely large in any one use case.
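
As a rough illustration of the two estimates discussed above (a
sketch only, not taken from the patch; the function names, the
"typical" size parameter and all inputs are hypothetical), the
worst-case bound and the one-bloated-cache case could be compared
like this:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: upper bound if every cache reached the
 * per-cache soft limit at the same time.
 */
static size_t
worst_case_total(size_t ncaches, size_t target_bytes)
{
    return ncaches * target_bytes;
}

/*
 * Hypothetical helper: total when only nbloated caches reach the
 * limit and the remaining ones stay at some typical size.
 */
static size_t
expected_total(size_t ncaches, size_t nbloated,
               size_t target_bytes, size_t typical_bytes)
{
    assert(nbloated <= ncaches);
    return nbloated * target_bytes +
           (ncaches - nbloated) * typical_bytes;
}
```

With the assumption that only one cache bloats, expected_total()
stays far below worst_case_total(), which is why the product
formula would overstate real usage in most cases.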

> Related to the above, I just thought that setting syscache_memory_target per CatCache
> would make memory usage more efficient.

We could easily have per-cache settings in CatCache, but how do
we provide the knobs for them? I can only think of far too many
possible solutions for that.

> Though I haven't checked if there's a case that each system catalog cache memory usage varies largely,
> pg_class cache might need more memory than others and others might need less.
> But it would be difficult for users to check each CatCache memory usage and tune it
> because right now postgresql hasn't provided a handy way to check them.

I assumed that this would be used without such a means. Someone
who suffers from syscache bloat can just set this GUC to avoid the
bloat, and that's all.

Apart from that, in the current patch, syscache_memory_target is
not exact at all in the first place, in order to avoid the
overhead of counting the correct size. The major difference comes
from the size of the cache tuples themselves. But I have come to
think that is too much to omit.

As a *PoC*, in the attached patch (which applies to current
master), the sizes of CTups are counted toward the catcache size.
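
To make the idea concrete, here is a minimal sketch of that kind of
accounting. It is not the code from the attached patch: DemoCache,
DEMO_ENTRY_HEADER_SIZE and the demo_* functions are hypothetical
stand-ins, and the header size is an arbitrary assumed value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a catalog cache. */
typedef struct DemoCache
{
    size_t      size_bytes; /* running total compared with the target */
} DemoCache;

/* Assumed fixed per-entry overhead; the real value would differ. */
#define DEMO_ENTRY_HEADER_SIZE 64

/* Charge the fixed header plus the variable-length tuple data. */
static void
demo_cache_add(DemoCache *cache, size_t tuple_len)
{
    cache->size_bytes += DEMO_ENTRY_HEADER_SIZE + tuple_len;
}

/* Undo the charge made by demo_cache_add() for the same entry. */
static void
demo_cache_remove(DemoCache *cache, size_t tuple_len)
{
    size_t      charge = DEMO_ENTRY_HEADER_SIZE + tuple_len;

    assert(cache->size_bytes >= charge);
    cache->size_bytes -= charge;
}

/* Soft-limit check: nonzero when the cache exceeds the target. */
static int
demo_cache_over_target(const DemoCache *cache, size_t target_bytes)
{
    return cache->size_bytes > target_bytes;
}
```

The point of the sketch is only that the counter changes by one
addition or subtraction per entry, so including the tuple length
makes the figure more accurate without walking the cache.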

It also provides a pg_catcache_size system view, just to give a
rough idea of what such a view looks like. I'll think more about
that, but do you have any opinion on this?

=# select relid::regclass, indid::regclass, size from pg_syscache_sizes order by size desc;
    relid     |              indid               |  size
--------------+----------------------------------+--------
 pg_class     | pg_class_oid_index               | 131072
 pg_class     | pg_class_relname_nsp_index       | 131072
 pg_cast      | pg_cast_source_target_index      |   5504
 pg_operator  | pg_operator_oprname_l_r_n_index  |   4096
 pg_statistic | pg_statistic_relid_att_inh_index |   2048
 pg_proc      | pg_proc_proname_args_nsp_index   |   2048

> Another option is that users specify only the total memory target size and postgres
> dynamically changes each CatCache's memory target size according to a certain metric
> (which still seems difficult and expensive to develop relative to the benefit).
> What do you think about this?

Given that few caches bloat at once, its effect would not be so
different from that of the current design.

> As you commented here, guc variable syscache_memory_target and
> syscache_prune_min_age are used for both syscache and relcache (HTAB), right?

Right, just to avoid adding knobs for unclear reasons. Since ...

> Do syscache and relcache have the similar amount of memory usage?

They may differ, but that would not make much difference in the
case of cache bloat.

> If not, I think introducing a separate GUC variable would be fine.
> The same goes for syscache_prune_min_age.

I implemented it so that it can easily be replaced if needed, but
I'm not sure separating them makes a significant difference.

Thanks for the opinion. I'll give this more consideration.


Kyotaro Horiguchi
NTT Open Source Software Center

Attachment: v5-2-PoC-0001-Remove-entries-that-haven-t-been-used-for-a-certain-.patch (text/x-patch, 28.1 KB)
