Re: Protect syscache from bloating with negative cache entries

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com, alvherre(at)alvh(dot)no-ip(dot)org, andres(at)anarazel(dot)de, robertmhaas(at)gmail(dot)com, michael(dot)paquier(at)gmail(dot)com, david(at)pgmasters(dot)net, Jim(dot)Nasby(at)bluetreble(dot)com, craig(at)2ndquadrant(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us
Subject: Re: Protect syscache from bloating with negative cache entries
Date: 2018-10-19 08:34:57
Message-ID: 20181019.173457.68080786.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello. Thank you for the comment.

At Thu, 4 Oct 2018 04:27:04 +0000, "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com> wrote in <4E72940DA2BF16479384A86D54D0988A6F1BCB6F(at)G01JPEXMBKW04>
> >As a *PoC*, in the attached patch (which applies to current master), size of CTups are
> >counted as the catcache size.
> >
> >It also provides pg_catcache_size system view just to give a rough idea of how such
> >view looks. I'll consider more on that but do you have any opinion on this?
> >
...
> Great! I like this view.
> One of the extreme idea would be adding all the members printed by CatCachePrintStats(),
> which is only enabled with -DCATCACHE_STATS at this moment.
> All of the members seems too much for customers who tries to change the cache limit size
> But it may be some of the members are useful because for example cc_hits would indicate that current
> cache limit size is too small.

The attached introduces four features below. (But the features on
relcache and plancache are omitted).

1. syscache stats collector (in 0002)

Records syscache status consists of the same columns above and
"ageclass" information. We could somehow triggering a stats
report with signal but we don't want take/send/write the
statistics in signal handler. Instead, it is turned on by setting
track_syscache_usage_interval to a positive number in
milliseconds.

2. pg_stat_syscache view. (in 0002)

This view shows catcache statistics. Statistics is taken only on
the backends where syscache tracking is active.

> pid | application_name | relname | cache_name | size | ageclass | nentries
> ------+------------------+----------------+-----------------------------------+----------+-------------------------+---------------------------
> 9984 | psql | pg_statistic | pg_statistic_relid_att_inh_index | 12676096 | {30,60,600,1200,1800,0} | {17660,17310,55870,0,0,0}

Age class is the basis of catcache truncation mechanism and shows
the distribution based on elapsed time since last access. As I
didn't came up an appropriate way, it is represented as two
arrays. Ageclass stores maximum age for each class in
seconds. Nentries holds entry numbers correnponding to the same
element in ageclass. In the above example,

age class : # of entries in the cache
up to 30s : 17660
up to 60s : 17310
up to 600s : 55870
up to 1200s : 0
up to 1800s : 0
more longer : 0

The ageclass is {0, 0.05, 0.1, 1, 2, 3}th multiples of
cache_prune_min_age on the backend.

3. non-transactional GUC setting (in 0003)

It allows setting GUC variable set by the action
GUC_ACTION_NONXACT(the name requires condieration) survive beyond
rollback. It is required by remote guc setting to work
sanely. Without the feature a remote-set value within a trasction
will disappear involved in rollback. The only local interface for
the NONXACT action is set_config(name, value, is_local=false,
is_nonxact = true). pg_set_backend_guc() below works on this
feature.

4. pg_set_backend_guc() function.

Of course syscache statistics recording consumes significant
amount of time so it cannot be turned on usually. On the other
hand since this feature is turned on by GUC, it is needed to grab
the active client connection to turn on/off the feature(but we
cannot). Instead, I provided a means to change GUC variables in
another backend.

pg_set_backend_guc(pid, name, value) sets the GUC variable "name"
on the backend "pid" to "value".

With the above tools, we can inspect catcache statistics of
seemingly bloated process.

A. Find a bloated process pid using ps or something.

B. Turn on syscache stats on the process.
=# select pg_set_backend_guc(9984, 'track_syscache_usage_interval', '10000');

C. Examine the statitics.

=# select pid, relname, cache_name, size from pg_stat_syscache order by size desc limit 3;
pid | relname | cache_name | size
------+--------------+----------------------------------+----------
9984 | pg_statistic | pg_statistic_relid_att_inh_index | 32154112
9984 | pg_cast | pg_cast_source_target_index | 4096
9984 | pg_operator | pg_operator_oprname_l_r_n_index | 4096

=# select * from pg_stat_syscache where cache_name = 'pg_statistic_relid_att_inh_index'::regclass;
-[ RECORD 1 ]---------------------------------
pid | 9984
relname | pg_statistic
cache_name | pg_statistic_relid_att_inh_index
size | 11026176
ntuples | 77950
searches | 77950
hits | 0
neg_hits | 0
ageclass | {30,60,600,1200,1800,0}
nentries | {17630,16950,43370,0,0,0}
last_update | 2018-10-17 15:58:19.738164+09

> >> Another option is that users only specify the total memory target size
> >> and postgres dynamically change each CatCache memory target size according to a
> >certain metric.
> >> (, which still seems difficult and expensive to develop per benefit)
> >> What do you think about this?
> >
> >Given that few caches bloat at once, it's effect is not so different from the current
> >design.
> Yes agreed.
>
> >> As you commented here, guc variable syscache_memory_target and
> >> syscache_prune_min_age are used for both syscache and relcache (HTAB), right?
> >
> >Right, just not to add knobs for unclear reasons. Since ...
> >
> >> Do syscache and relcache have the similar amount of memory usage?
> >
> >They may be different but would make not so much in the case of cache bloat.
> >> If not, I'm thinking that introducing separate guc variable would be fine.
> >> So as syscache_prune_min_age.
> >
> >I implemented that so that it is easily replaceable in case, but I'm not sure separating
> >them makes significant difference..
> Maybe I was overthinking mixing my development.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
0001-Remove-entries-that-haven-t-been-used-for-a-certain-.patch text/x-patch 15.2 KB
0002-Syscache-usage-tracking-feature.patch text/x-patch 35.1 KB
0003-Remote-GUC-setting-feature-and-non-xact-GUC-config.patch text/x-patch 45.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mehman Jafarov 2018-10-19 09:03:12 Problem about partitioned table
Previous Message Michael Paquier 2018-10-19 07:47:29 Re: partition tree inspection functions