Re: Protect syscache from bloating with negative cache entries

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: GavinFlower(at)archidevsys(dot)co(dot)nz
Cc: bruce(at)momjian(dot)us, robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com, alvherre(at)alvh(dot)no-ip(dot)org, andres(at)anarazel(dot)de, michael(dot)paquier(at)gmail(dot)com, david(at)pgmasters(dot)net, Jim(dot)Nasby(at)bluetreble(dot)com, craig(at)2ndquadrant(dot)com
Subject: Re: Protect syscache from bloating with negative cache entries
Date: 2019-01-18 08:33:30
Message-ID: 20190118.173330.139175539.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

At Fri, 18 Jan 2019 16:39:29 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20190118(dot)163929(dot)229869562(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> Hello.
>
> At Fri, 18 Jan 2019 11:46:03 +1300, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz> wrote in <4e62e6b7-0ffb-54ae-3757-5583fcca38c0(at)archidevsys(dot)co(dot)nz>
> > On 18/01/2019 08:48, Bruce Momjian wrote:
> > > Unfortunately, because we have not found something we are happy with,
> > > we have done nothing. I agree LRU can be expensive. What if we do some
> > > kind of clock sweep and expiration like we do for shared buffers? I
>
> So, it doesn't use LRU but a kind of clock-sweep method. When the
> current hash fills up and resizing (doubling) it would push the
> size past the threshold, it tries to trim away entries that have
> been left unused for a duration corresponding to their usage
> count. This is not a hard limit but seems to be a good
> compromise.
>
> > > think the trick is figuring how frequently to do the sweep. What if
> > > we mark entries as unused every 10 queries, mark them as used on first
> > > use, and delete cache entries that have not been used in the past 10
> > > queries.
>
> As above, it tries pruning at every resize. So the only cost
> added to the frequent paths is setting the last-accessed time and
> incrementing the access counter. It scans the whole hash at
> resize time, but that doesn't add much compared to the resizing
> itself.
>
> > If you take that approach, then this number should be configurable. 
> > What if I had 12 common queries I used in rotation?
>
> This basically has two knobs: the minimum hash size at which
> pruning starts and the idle time before unused entries are
> reaped, per catcache.

This is the rebased version.

0001: catcache pruning

syscache_memory_target sets the minimum size, on a per-cache
basis, at which pruning starts.

syscache_prune_min_time sets the minimum idle duration before a
catcache entry is removed.
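
As a rough illustration of how the two knobs interact, here is a
minimal standalone sketch of clock-sweep-style pruning at resize
time. It is not code from the patch: the Entry and Cache structs,
the function names, the MAX_NACCESS cap, and the use of an entry
count in place of a memory size are all assumptions made for the
example. min_entries stands in for syscache_memory_target and
prune_min_time for syscache_prune_min_time.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified entry; not the real catcache tuple layout. */
typedef struct Entry
{
    struct Entry *next;
    int           key;
    unsigned      naccess;      /* capped usage count */
    int64_t       last_access;  /* time of last hit, in seconds */
} Entry;

typedef struct Cache
{
    Entry **buckets;
    size_t  nbuckets;
    size_t  nentries;
} Cache;

#define MAX_NACCESS 2           /* assumed cap on the usage count */

/* Hot path: the only extra work is one store and one capped increment. */
static void
touch_entry(Entry *e, int64_t now)
{
    e->last_access = now;
    if (e->naccess < MAX_NACCESS)
        e->naccess++;
}

/* Insert a new entry at the head of its bucket (no resize logic here). */
static Entry *
cache_insert(Cache *c, int key, int64_t now)
{
    Entry  *e = malloc(sizeof(Entry));
    size_t  b = (size_t) ((unsigned) key % c->nbuckets);

    if (e == NULL)
        return NULL;
    e->key = key;
    e->naccess = 0;
    e->last_access = now;
    e->next = c->buckets[b];
    c->buckets[b] = e;
    c->nentries++;
    return e;
}

/*
 * Meant to be called when the hash is full and about to be doubled.
 * Entries idle longer than prune_min_time lose one usage count per sweep
 * and are removed once the count has reached zero, so frequently used
 * entries survive more sweeps. Returns the number of entries removed.
 */
static size_t
prune_cache(Cache *c, int64_t now, size_t min_entries, int64_t prune_min_time)
{
    size_t removed = 0;

    if (c->nentries < min_entries)
        return 0;               /* cache too small to bother pruning */

    for (size_t i = 0; i < c->nbuckets; i++)
    {
        Entry **link = &c->buckets[i];

        while (*link != NULL)
        {
            Entry *e = *link;

            if (now - e->last_access > prune_min_time)
            {
                if (e->naccess > 0)
                    e->naccess--;       /* give it another chance */
                else
                {
                    *link = e->next;    /* unlink and free the idle entry */
                    free(e);
                    c->nentries--;
                    removed++;
                    continue;
                }
            }
            link = &e->next;
        }
    }
    return removed;
}
```

In this sketch an entry that was hit once survives the first sweep
after it goes idle and is removed by the next one, which is one way
to read "a duration corresponding to usage count".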

0002: catcache statistics view

track_syscache_usage_interval is the interval at which catcache
statistics are collected.

pg_stat_syscache is the view that shows the statistics.

0003: Remote GUC setting

It is independent of the above two, and highly debatable.

pg_set_backend_config(pid, name, value) changes the GUC <name> on
the backend with <pid> to <value>.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v7-0001-Remove-entries-that-haven-t-been-used-for-a-certain-.patch text/x-patch 15.2 KB
v7-0002-Syscache-usage-tracking-feature.patch text/x-patch 37.2 KB
v7-0003-Remote-GUC-setting-feature-and-non-xact-GUC-config.patch text/x-patch 43.2 KB
