Re: Protect syscache from bloating with negative cache entries

From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "alvherre(at)alvh(dot)no-ip(dot)org" <alvherre(at)alvh(dot)no-ip(dot)org>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "michael(dot)paquier(at)gmail(dot)com" <michael(dot)paquier(at)gmail(dot)com>, "david(at)pgmasters(dot)net" <david(at)pgmasters(dot)net>, "Jim(dot)Nasby(at)bluetreble(dot)com" <Jim(dot)Nasby(at)bluetreble(dot)com>, "craig(at)2ndquadrant(dot)com" <craig(at)2ndquadrant(dot)com>
Subject: Re: Protect syscache from bloating with negative cache entries
Date: 2019-01-17 22:46:03
Message-ID: 4e62e6b7-0ffb-54ae-3757-5583fcca38c0@archidevsys.co.nz
Lists: pgsql-hackers

On 18/01/2019 08:48, Bruce Momjian wrote:
> On Thu, Jan 17, 2019 at 11:33:35AM -0500, Robert Haas wrote:
>> The flaw in your thinking, as it seems to me, is that in your concern
>> for "the likelihood that cache flushes will simply remove entries
>> we'll soon have to rebuild," you're apparently unwilling to consider
>> the possibility of workloads where cache flushes will remove entries
>> we *won't* soon have to rebuild. Every time that issue gets raised,
>> you seem to blow it off as if it were not a thing that really happens.
>> I can't make sense of that position. Is it really so hard to imagine
>> a connection pooler that switches the same connection back and forth
>> between two applications with different working sets? Or a system
>> that keeps persistent connections open even when they are idle? Do
>> you really believe that a connection that has not accessed a cache
>> entry in 10 minutes still derives more benefit from that cache entry
>> than it would from freeing up some memory?
> Well, I think everyone agrees there are workloads that cause undesired
> cache bloat. What we have not found is a solution that doesn't cause
> code complexity or undesired overhead, or one that >1% of users will
> know how to use.
>
> Unfortunately, because we have not found something we are happy with, we
> have done nothing. I agree LRU can be expensive. What if we do some
> kind of clock sweep and expiration like we do for shared buffers? I
> think the trick is figuring how frequently to do the sweep. What if we
> mark entries as unused every 10 queries, mark them as used on first use,
> and delete cache entries that have not been used in the past 10 queries?
>
If you take that approach, then this number should be configurable. 
What if I had 12 common queries I used in rotation?
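
To make the concern concrete, here is a minimal toy sketch of the sweep scheme described above, with the interval as a parameter. It is not Postgres code; the class name `SweepCache` and its fields are hypothetical, and a "query" is modelled as a single cache lookup.

```python
class SweepCache:
    """Toy model (hypothetical, not Postgres code): every `sweep_interval`
    lookups, delete entries not used since the previous sweep and mark
    the surviving entries as unused."""

    def __init__(self, sweep_interval=10):
        self.sweep_interval = sweep_interval
        self.entries = {}   # key -> True if used since the last sweep
        self.queries = 0
        self.misses = 0

    def lookup(self, key):
        if key not in self.entries:
            self.misses += 1          # entry has to be rebuilt
        self.entries[key] = True      # mark as used
        self.queries += 1
        if self.queries % self.sweep_interval == 0:
            self._sweep()

    def _sweep(self):
        # Keep only entries used since the last sweep, and reset their flag.
        self.entries = {k: False for k, used in self.entries.items() if used}


def misses_for_rotation(sweep_interval, distinct_queries=12, total=120):
    """Run `total` lookups cycling through `distinct_queries` keys."""
    cache = SweepCache(sweep_interval)
    for i in range(total):
        cache.lookup(i % distinct_queries)
    return cache.misses
```

In this model, 12 queries in rotation with an interval of 10 keep losing entries shortly before they are needed again, whereas an interval of 12 or more only misses on the first pass. That is the case for making the number configurable.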

The ARM3 processor cache logic was to simply eject an entry at random,
as Acorn obviously felt that the silicon required for a more
sophisticated algorithm would reduce the cache size too much!

I upgraded my Acorn Archimedes, which had an 8 MHz bus, from an 8 MHz
ARM2 to a 25 MHz ARM3. That is a clock rate improvement of about 3
times. However, BASIC programs ran about 7 times faster, which I put
down to the ARM3 having a cache.

Obviously for Postgres this is not directly relevant, but I think it
suggests that it may be worth considering replacing cache items at
random, as there are no pathological corner cases and the logic is
very simple.
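
As a rough illustration of how little logic random replacement needs, here is a toy sketch. It assumes a fixed-capacity cache, which is an assumption of the sketch rather than how the syscache works today; the names `RandomEvictionCache`, `capacity` and `load` are hypothetical.

```python
import random


class RandomEvictionCache:
    """Toy fixed-capacity cache (hypothetical) that evicts a random
    entry when it is full. No usage tracking is kept at all."""

    def __init__(self, capacity, rng=None):
        self.capacity = capacity
        self.rng = rng or random.Random()
        self.slots = []     # keys, so a victim can be picked in O(1)
        self.values = {}    # key -> cached value

    def get(self, key, load):
        """Return the cached value, calling load(key) on a miss."""
        if key in self.values:
            return self.values[key]
        value = load(key)
        if len(self.slots) >= self.capacity:
            # Pick a random slot, drop its entry, and reuse the slot.
            pos = self.rng.randrange(len(self.slots))
            del self.values[self.slots[pos]]
            self.slots[pos] = key
        else:
            self.slots.append(key)
        self.values[key] = value
        return value


def demo(capacity=4, keys=10, seed=0):
    """Load `keys` distinct keys into a cache of size `capacity`."""
    cache = RandomEvictionCache(capacity, random.Random(seed))
    for k in range(keys):
        cache.get(k, lambda x: x * x)
    return cache
```

There are no per-entry counters or timestamps to maintain on a hit, which is the attraction; the cost is that a hot entry can occasionally be the one ejected.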

Cheers,
Gavin
