Re: Protect syscache from bloating with negative cache entries

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "alvherre(at)alvh(dot)no-ip(dot)org" <alvherre(at)alvh(dot)no-ip(dot)org>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "michael(dot)paquier(at)gmail(dot)com" <michael(dot)paquier(at)gmail(dot)com>, "david(at)pgmasters(dot)net" <david(at)pgmasters(dot)net>, "craig(at)2ndquadrant(dot)com" <craig(at)2ndquadrant(dot)com>
Subject: Re: Protect syscache from bloating with negative cache entries
Date: 2019-01-18 20:50:21
Message-ID: CA+TgmoaQVtw=D8sDe78NwrOAPmJFjsR6XWtQ29C=fquoBvhCVw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 17, 2019 at 2:48 PM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Well, I think everyone agrees there are workloads that cause undesired
> cache bloat. What we have not found is a solution that doesn't cause
> code complexity or undesired overhead, or one that >1% of users will
> know how to use.
>
> Unfortunately, because we have not found something we are happy with, we
> have done nothing. I agree LRU can be expensive. What if we do some
> kind of clock sweep and expiration like we do for shared buffers? I
> think the trick is figuring out how frequently to do the sweep. What if we
> mark entries as unused every 10 queries, mark them as used on first use,
> and delete cache entries that have not been used in the past 10 queries?

I still think wall-clock time is a perfectly reasonable heuristic.
Say every 5 or 10 minutes you walk through the cache. Anything that
hasn't been touched since the last scan you throw away. If you do
this, you MIGHT flush an entry that you're just about to need again,
but (1) it's not very likely, because if it hasn't been touched in
many minutes, the chances that it's about to be needed again are low,
and (2) even if it does happen, it probably won't cost all that much,
because *occasionally* reloading a cache entry unnecessarily isn't
that costly; the big problem is when you do it over and over again,
which can easily happen with a fixed size limit on the cache, and (3)
if somebody does have a workload where they touch the same object
every 11 minutes, we can give them a GUC to control the timeout
between cache sweeps and it's really not that hard to understand how
to set it. And most people won't need to.
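
To make the idea concrete, here is a minimal sketch of that kind of sweep. It is a toy illustration in Python, not the syscache code and not any proposed patch. The class name SweptCache, the sweep_interval_s parameter (standing in for the GUC mentioned above), and the loader callback are all hypothetical. For simplicity the sweep is piggybacked on cache access rather than driven by a timer, which is an assumption of this sketch.

```python
import time


class SweptCache:
    """Toy cache that evicts entries not touched since the previous sweep."""

    def __init__(self, sweep_interval_s=600.0, clock=time.monotonic):
        # sweep_interval_s is a hypothetical knob playing the role of the GUC.
        self.sweep_interval_s = sweep_interval_s
        self._clock = clock
        self._entries = {}  # key -> [value, touched_since_last_sweep]
        self._last_sweep = clock()

    def get(self, key, loader):
        """Return the cached value, loading it via loader(key) on a miss."""
        self._maybe_sweep()
        entry = self._entries.get(key)
        if entry is None:
            entry = [loader(key), True]
            self._entries[key] = entry
        else:
            entry[1] = True  # mark as touched since the last sweep
        return entry[0]

    def _maybe_sweep(self):
        """If the interval has elapsed, drop untouched entries and reset flags."""
        now = self._clock()
        if now - self._last_sweep < self.sweep_interval_s:
            return
        self._last_sweep = now
        # Throw away anything that was not touched since the last sweep.
        for key in [k for k, e in self._entries.items() if not e[1]]:
            del self._entries[key]
        # Survivors must be touched again before the next sweep to stay.
        for e in self._entries.values():
            e[1] = False
```

An entry that gets flushed and is then needed again is simply reloaded once through loader, which is the occasional cost described in point (2). The cache has no size limit, so a working set larger than any fixed bound is never reloaded over and over.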

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
