Re: Protect syscache from bloating with negative cache entries

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: andres(at)anarazel(dot)de
Cc: robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, bruce(at)momjian(dot)us, ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com, alvherre(at)alvh(dot)no-ip(dot)org, michael(dot)paquier(at)gmail(dot)com, david(at)pgmasters(dot)net, craig(at)2ndquadrant(dot)com
Subject: Re: Protect syscache from bloating with negative cache entries
Date: 2019-01-21 07:48:02
Message-ID: 20190121.164802.81311236.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello.

At Fri, 18 Jan 2019 17:09:41 -0800, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de> wrote in <20190119010941(dot)6ruftewah7t3k3yk(at)alap3(dot)anarazel(dot)de>
> Hi,
>
> On 2019-01-18 19:57:03 -0500, Robert Haas wrote:
> > On Fri, Jan 18, 2019 at 4:23 PM andres(at)anarazel(dot)de <andres(at)anarazel(dot)de> wrote:
> > > My proposal for this was to attach a 'generation' to cache entries. Upon
> > > access cache entries are marked to be of the current
> > > generation. Whenever existing memory isn't sufficient for further cache
> > > entries and, on a less frequent schedule, triggered by a timer, the
> > > cache generation is increased and the new generation's "creation time" is
> > > measured. Then generations that are older than a certain threshold are
> > > purged, and if there are any, the entries of the purged generation are
> > > removed from the caches using a sequential scan through the cache.
> > >
> > > This outline achieves:
> > > - no additional time measurements in hot code paths

In this patch the timestamp is acquired at every transaction start
and stored as a TimestampTz. So there is already no additional time
measurement, but cache pruning won't happen while a transaction
lives for a long time. A time-driven generation value, maybe with
a fixed interval of 10s to 1min, is a possible option.
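
To make the time-driven option concrete, here is a minimal sketch in C.
It is not code from the patch: GenClock, gen_clock_init() and
gen_clock_tick() are hypothetical names, times are plain seconds
instead of TimestampTz, and the caller is assumed to be some periodic
timer rather than transaction start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical clock state; times are in seconds for readability. */
typedef struct GenClock
{
    uint32_t generation;   /* current generation number */
    int64_t  created_at;   /* "creation time" of the current generation */
    int64_t  interval;     /* fixed generation length, e.g. 10 to 60 */
} GenClock;

void
gen_clock_init(GenClock *c, int64_t now, int64_t interval)
{
    assert(c != NULL && interval > 0);
    c->generation = 0;
    c->created_at = now;
    c->interval = interval;
}

/*
 * Called from a timer. Starts a new generation once the current one
 * has lasted at least "interval"; returns 1 if it did, otherwise 0.
 */
int
gen_clock_tick(GenClock *c, int64_t now)
{
    assert(c != NULL);
    if (now - c->created_at < c->interval)
        return 0;
    c->generation++;
    c->created_at = now;
    return 1;
}
```

Because the clock advances independently of transaction boundaries, a
long-lived transaction would no longer hold pruning back in this
scheme.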

> > > - no need for a sequential scan of the entire cache when no generations
> > > are too old

This patch didn't precheck against the oldest generation, but it
can be easily calculated. (It is based not on the creation time
but on the last-access time.) (The attached applies on top of the
v7-0001-Remove-entries-..patch)

Using generation time, entries are purged even if they were recently
accessed. I think last-accessed time is more suitable for the
purpose. On the other hand, when using last-accessed time, the
recorded oldest generation can become stale because of later accesses.
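
The trade-off can be illustrated with a small, self-contained sketch
in C. It is an assumption-laden toy, not the attached patch: DemoCache,
its fixed slot array and all demo_cache_* functions are hypothetical,
and times are plain seconds. The cache remembers the oldest last-access
time it has seen and skips the scan when even that one is younger than
min_age. A later access does not update the remembered value, so it
can be stale, but in this sketch it only ever errs on the old side: a
stale value can cause a needless scan, never a missed one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SLOTS 8

typedef struct DemoEntry
{
    int     in_use;
    int64_t last_access;      /* seconds */
} DemoEntry;

typedef struct DemoCache
{
    DemoEntry slots[DEMO_SLOTS];
    int64_t   oldest_access;  /* lower bound; INT64_MAX when empty */
    int       nscans;         /* number of full scans performed */
} DemoCache;

void
demo_cache_init(DemoCache *c)
{
    assert(c != NULL);
    for (int i = 0; i < DEMO_SLOTS; i++)
    {
        c->slots[i].in_use = 0;
        c->slots[i].last_access = 0;
    }
    c->oldest_access = INT64_MAX;
    c->nscans = 0;
}

void
demo_cache_insert(DemoCache *c, int slot, int64_t now)
{
    assert(slot >= 0 && slot < DEMO_SLOTS && !c->slots[slot].in_use);
    c->slots[slot].in_use = 1;
    c->slots[slot].last_access = now;
    if (now < c->oldest_access)
        c->oldest_access = now;
}

/* An access refreshes the entry but leaves oldest_access untouched. */
void
demo_cache_touch(DemoCache *c, int slot, int64_t now)
{
    assert(slot >= 0 && slot < DEMO_SLOTS && c->slots[slot].in_use);
    c->slots[slot].last_access = now;
}

/* Returns the number of entries removed. */
int
demo_cache_prune(DemoCache *c, int64_t now, int64_t min_age)
{
    int     removed = 0;
    int64_t oldest = INT64_MAX;

    /* Precheck: nothing can be old enough, so skip the scan. */
    if (c->oldest_access == INT64_MAX || now - c->oldest_access < min_age)
        return 0;

    c->nscans++;
    for (int i = 0; i < DEMO_SLOTS; i++)
    {
        DemoEntry *e = &c->slots[i];

        if (!e->in_use)
            continue;
        if (now - e->last_access >= min_age)
        {
            e->in_use = 0;
            removed++;
        }
        else if (e->last_access < oldest)
            oldest = e->last_access;
    }
    c->oldest_access = oldest;  /* recomputed from the survivors */
    return removed;
}
```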

> > > - both size and time limits can be implemented reasonably cheaply
> > > - overhead when feature disabled should be close to zero

Overhead when disabled is already nothing since scanning is
inhibited when cache_prune_min_age is a negative value.

> > Seems generally reasonable. The "whenever existing memory isn't
> > sufficient for further cache entries" part I'm not sure about.
> > Couldn't that trigger very frequently and prevent necessary cache size
> > growth?
>
> I'm thinking it'd just trigger a new generation, with its associated
> "creation" time (which is cheap to acquire in comparison to creating a
> number of cache entries). Depending on settings or just code policy we
> can decide up to which generation to prune the cache, using that
> creation time. I'd imagine that we'd have some default cache-pruning
> time in the minutes, and for workloads where relevant one can make
> sizing configurations more aggressive - or something like that.

The current patch uses last-accessed time, obtained without calling
gettimeofday(). The number of generations is fixed at up to 3, and
infrequently accessed entries are removed sooner. The generation
interval is determined by cache_prune_min_age.
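
One possible reading of "up to 3 generations" is sketched below in C.
This is an assumption for illustration, not a description of the
patch: AgedEntry, the saturating naccess counter and both functions
are hypothetical. Each access bumps a counter capped at 2; each time
pruning finds the entry idle for min_age it spends one count instead
of evicting, so a never-reaccessed entry goes after one interval and
a frequently accessed one after three.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_NACCESS 2    /* hypothetical cap: 3 generations in total */

typedef struct AgedEntry
{
    int     live;
    int     naccess;          /* saturating access counter, 0..2 */
    int64_t last_access;      /* seconds */
} AgedEntry;

void
aged_entry_access(AgedEntry *e, int64_t now)
{
    assert(e != NULL && e->live);
    if (e->naccess < DEMO_MAX_NACCESS)
        e->naccess++;
    e->last_access = now;
}

/* Returns 1 if the entry was evicted by this call, otherwise 0. */
int
aged_entry_prune(AgedEntry *e, int64_t now, int64_t min_age)
{
    assert(e != NULL);
    if (!e->live || now - e->last_access < min_age)
        return 0;
    if (e->naccess > 0)
    {
        /* Demote by one generation and restart its idle period. */
        e->naccess--;
        e->last_access = now;
        return 0;
    }
    e->live = 0;
    return 1;
}
```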

Although this doesn't put a hard cap on memory usage, it is
indirectly and softly limited by cache_prune_min_age and
cache_memory_target, which determine how large a cache can grow
before pruning happens. They apply on a per-cache basis.
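
How the two settings could gate a pruning scan is sketched below in C.
The struct and function are hypothetical, and the units (seconds and
bytes) are assumptions; only the two setting names and the rule that
a negative cache_prune_min_age disables scanning come from this
thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-cache copy of the two settings. */
typedef struct PruneSettings
{
    int    cache_prune_min_age;   /* seconds; negative disables pruning */
    size_t cache_memory_target;   /* bytes a cache may use before pruning */
} PruneSettings;

/* Returns 1 if a pruning scan should be considered for this cache. */
int
demo_should_scan(const PruneSettings *s, size_t cache_bytes)
{
    assert(s != NULL);
    if (s->cache_prune_min_age < 0)
        return 0;                 /* feature disabled: no scan at all */
    return cache_bytes > s->cache_memory_target;
}
```

This is a soft limit only: a cache above the target whose entries are
all younger than cache_prune_min_age would be scanned but not shrunk.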

If we prefer to set a budget on all the syscaches (or even
including other caches), it would be more complex.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment: add_v7-0001-Remove-entries-that-haven-t-been-used-for-a-certain-.patch (text/x-patch, 2.9 KB)
