Re: Protect syscache from bloating with negative cache entries

From: "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "alvherre(at)alvh(dot)no-ip(dot)org" <alvherre(at)alvh(dot)no-ip(dot)org>, "michael(dot)paquier(at)gmail(dot)com" <michael(dot)paquier(at)gmail(dot)com>, "david(at)pgmasters(dot)net" <david(at)pgmasters(dot)net>, "craig(at)2ndquadrant(dot)com" <craig(at)2ndquadrant(dot)com>
Subject: Re: Protect syscache from bloating with negative cache entries
Date: 2019-01-19 01:09:41
Message-ID: 20190119010941.6ruftewah7t3k3yk@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-01-18 19:57:03 -0500, Robert Haas wrote:
> On Fri, Jan 18, 2019 at 4:23 PM andres(at)anarazel(dot)de <andres(at)anarazel(dot)de> wrote:
> > My proposal for this was to attach a 'generation' to cache entries. Upon
> > access cache entries are marked to be of the current
> > generation. Whenever existing memory isn't sufficient for further cache
> > entries and, on a less frequent schedule, triggered by a timer, the
> > cache generation is increased and the new generation's "creation time" is
> > measured. Then generations that are older than a certain threshold are
> > purged, and if there are any, the entries of the purged generation are
> > removed from the caches using a sequential scan through the cache.
> >
> > This outline achieves:
> > - no additional time measurements in hot code paths
> > - no need for a sequential scan of the entire cache when no generations
> > are too old
> > - both size and time limits can be implemented reasonably cheaply
> > - overhead when feature disabled should be close to zero
>
> Seems generally reasonable. The "whenever existing memory isn't
> sufficient for further cache entries" part I'm not sure about.
> Couldn't that trigger very frequently and prevent necessary cache size
> growth?

I'm thinking it'd just trigger a new generation, with its associated
"creation" time (which is cheap to acquire in comparison to creating a
number of cache entries). Depending on settings or just code policy we
can decide up to which generation to prune the cache, using that
creation time. I'd imagine that we'd have some default cache-pruning
time on the order of minutes, and for workloads where it's relevant one
can make the sizing configuration more aggressive - or something like that.
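
A minimal sketch of the generation scheme discussed above, not actual
catcache code: every name here (Cache, cache_touch, cache_new_generation,
MAX_GENERATIONS, ...) is hypothetical, the cache is a fixed-size toy array,
and the caller passes in "now" so the clock is only consulted on the slow
path. One assumption beyond the text: a generation is treated as expired
once its successor's creation time is older than the threshold, so an
entry touched late in a generation is not pruned early.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ENTRIES     8    /* toy capacity; a real cache would grow */
#define MAX_GENERATIONS 16   /* how many creation times are remembered */

typedef struct {
    int      key;
    bool     valid;
    uint64_t generation;     /* generation in which the entry was last used */
} CacheEntry;

typedef struct {
    CacheEntry entries[MAX_ENTRIES];
    uint64_t   current_gen;  /* newest generation */
    uint64_t   oldest_gen;   /* oldest generation not yet purged */
    int64_t    gen_created[MAX_GENERATIONS];  /* ring of creation times */
} Cache;

void cache_init(Cache *c, int64_t now)
{
    memset(c, 0, sizeof(*c));
    c->gen_created[0] = now;
}

/* Hot path: a single store, no clock read. */
void cache_touch(Cache *c, CacheEntry *e)
{
    e->generation = c->current_gen;
}

/* One sequential scan removing entries last used in generation <= upto. */
int cache_purge_upto(Cache *c, uint64_t upto)
{
    int removed = 0;

    assert(upto < c->current_gen);   /* never purge the live generation */
    for (size_t i = 0; i < MAX_ENTRIES; i++) {
        CacheEntry *e = &c->entries[i];
        if (e->valid && e->generation <= upto) {
            e->valid = false;
            removed++;
        }
    }
    c->oldest_gen = upto + 1;
    return removed;
}

/* Slow path (timer or memory pressure): open a generation, purge old ones. */
int cache_new_generation(Cache *c, int64_t now, int64_t max_age)
{
    int      removed = 0;
    uint64_t g;

    /* Ring of creation times is full: give up the oldest generation. */
    if (c->current_gen + 1 - c->oldest_gen >= MAX_GENERATIONS)
        removed += cache_purge_upto(c, c->oldest_gen);

    c->current_gen++;
    c->gen_created[c->current_gen % MAX_GENERATIONS] = now;

    /* Count generations whose successor was created before the cutoff. */
    g = c->oldest_gen;
    while (g < c->current_gen &&
           c->gen_created[(g + 1) % MAX_GENERATIONS] <= now - max_age)
        g++;

    /* Scan the cache only if at least one generation is too old. */
    if (g > c->oldest_gen)
        removed += cache_purge_upto(c, g - 1);
    return removed;
}

/* Insert; when out of room, start a new generation and retry once. */
CacheEntry *cache_insert(Cache *c, int key, int64_t now, int64_t max_age)
{
    for (int attempt = 0; attempt < 2; attempt++) {
        for (size_t i = 0; i < MAX_ENTRIES; i++) {
            CacheEntry *e = &c->entries[i];
            if (!e->valid) {
                e->valid = true;
                e->key = key;
                e->generation = c->current_gen;
                return e;
            }
        }
        if (attempt == 0)
            cache_new_generation(c, now, max_age);
    }
    return NULL;   /* nothing was old enough; a real cache would grow here */
}
```

Running out of room only opens a new generation and prunes what is already
past max_age. If nothing is that old, no entries are evicted, which is the
point where a real cache would simply be allowed to grow.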

Greetings,

Andres Freund
