Re: Protect syscache from bloating with negative cache entries

From: Andres Freund <andres(at)anarazel(dot)de>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, robertmhaas(at)gmail(dot)com, michael(dot)paquier(at)gmail(dot)com, david(at)pgmasters(dot)net, Jim(dot)Nasby(at)bluetreble(dot)com, craig(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Protect syscache from bloating with negative cache entries
Date: 2018-03-01 18:54:55
Message-ID: 20180301185455.ybyw5yajodjezdoa@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017-12-26 18:19:16 +0900, Kyotaro HORIGUCHI wrote:
> --- a/src/backend/access/transam/xact.c
> +++ b/src/backend/access/transam/xact.c
> @@ -733,6 +733,9 @@ void
> SetCurrentStatementStartTimestamp(void)
> {
> stmtStartTimestamp = GetCurrentTimestamp();
> +
> + /* Set this time stamp as aproximated current time */
> + SetCatCacheClock(stmtStartTimestamp);
> }

Hm.

> + * Remove entries that haven't been accessed for a certain time.
> + *
> + * Sometimes catcache entries are left unremoved for several reasons.

I'm unconvinced that that's ok for positive entries, entirely regardless
of this patch.

> We
> + * cannot allow them to eat up the usable memory and still it is better to
> + * remove entries that are no longer accessed from the perspective of memory
> + * performance ratio. Unfortunately we cannot predict that but we can assume
> + * that entries that are not accessed for long time no longer contribute to
> + * performance.
> + */

This needs polish.

> +static bool
> +CatCacheCleanupOldEntries(CatCache *cp)
> +{
> + int i;
> + int nremoved = 0;
> +#ifdef CATCACHE_STATS
> + int ntotal = 0;
> + int tm[] = {30, 60, 600, 1200, 1800, 0};
> + int cn[6] = {0, 0, 0, 0, 0};
> + int cage[3] = {0, 0, 0};
> +#endif

This doesn't look nice, the names descriptive enough to be self evident,
and there's no comments what these random arrays mean. And some specify
lenght (and have differing number of elements!) and others don't.

> + /* Move all entries from old hash table to new. */
> + for (i = 0; i < cp->cc_nbuckets; i++)
> + {
> + dlist_mutable_iter iter;
> +
> + dlist_foreach_modify(iter, &cp->cc_bucket[i])
> + {
> + CatCTup *ct = dlist_container(CatCTup, cache_elem, iter.cur);
> + long s;
> + int us;
> +
> +
> + TimestampDifference(ct->lastaccess, catcacheclock, &s, &us);
> +
> +#ifdef CATCACHE_STATS
> + {
> + int j;
> +
> + ntotal++;
> + for (j = 0 ; tm[j] != 0 && s > tm[j] ; j++);
> + if (tm[j] == 0) j--;
> + cn[j]++;
> + }
> +#endif

What?

> + /*
> + * Remove entries older than 600 seconds but not recently used.
> + * Entries that are not accessed after creation are removed in 600
> + * seconds, and that has been used several times are removed after
> + * 30 minumtes ignorance. We don't try shrink buckets since they
> + * are not the major part of syscache bloat and they are expected
> + * to be filled shortly again.
> + */
> + if (s > 600)
> + {

So this is hardcoded, without any sort of cache pressure logic? Doesn't
that mean we'll often *severely* degrade performance if a backend is
idle for a while?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2018-03-01 19:00:52 Re: [PoC PATCH] Parallel dump to /dev/null
Previous Message Andres Freund 2018-03-01 18:43:02 Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)