Re: Protect syscache from bloating with negative cache entries

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Protect syscache from bloating with negative cache entries
Date: 2017-12-18 16:46:53
Message-ID: 20171218164653.qg7sm6xtn7zfx2hi@alap3.anarazel.de
Lists: pgsql-hackers

On 2017-12-17 19:23:45 -0500, Robert Haas wrote:
> On Sat, Dec 16, 2017 at 11:42 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> >> I'm not sure we should regard very quick bloating as a problem in need
> >> of solving. Doesn't that just mean we need the cache to be bigger, at
> >> least temporarily?
> >
> > Leaving that aside, is that actually not at least to a good degree,
> > solved by that problem? By bumping the generation on hash resize, we
> > have recency information we can take into account.
>
> I agree that we can do it. I'm just not totally sure it's a good
> idea. I'm also not totally sure it's a bad idea, either. That's why
> I asked the question.

I'm not 100% convinced either - but I also don't think it matters all
that terribly much. As long as the overall hash hit rate is decent,
minor increases in the absolute number of misses don't really matter
that much for syscache imo. I'd personally go for something like:

1) When about to resize, check whether there are entries from two
generations back (current generation - 2) around.

Don't resize if more than 15% of entries could be freed. Also, stop
reclaiming at that threshold, to avoid unnecessarily purging cache
entries.

Using two generations allows a bit more time for cache entries to be
marked as fresh before the next resize.

2) While resizing, increment the generation count by one.

3) Once a minute, increment the generation count by one.

The one thing I don't quite have a good handle on is how much cache
reclamation, if any, to do at 3). We don't really want to throw away
all the caches just because a connection has been idle for a few
minutes; in a connection pool that can happen occasionally. I think I'd
for now *not* do any reclamation except at resize boundaries.
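
To make the three steps concrete, here is a minimal sketch as a toy C
model. It is not catcache code: the ToyCache/ToyEntry types and all
toy_* functions are hypothetical names, the fixed-size array stands in
for the real hash table, and reading "stop reclaiming at that
threshold" as "free at most 15% of the entries" is an assumption about
the intent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTRIES 64   /* toy upper bound on slots (hypothetical) */
#define STALE_AGE   2    /* "generation -2" from step 1 */
#define RECLAIM_PCT 15   /* the 15% threshold from step 1 */

typedef struct ToyEntry
{
    bool     used;
    unsigned generation;     /* generation at last access */
} ToyEntry;

typedef struct ToyCache
{
    ToyEntry entries[MAX_ENTRIES];
    size_t   nentries;       /* number of used slots */
    size_t   capacity;       /* logical size; a "resize" doubles it */
    unsigned generation;     /* current generation */
} ToyCache;

static void
toy_init(ToyCache *c, size_t capacity)
{
    assert(capacity > 0 && capacity <= MAX_ENTRIES);
    for (size_t i = 0; i < MAX_ENTRIES; i++)
        c->entries[i].used = false;
    c->nentries = 0;
    c->capacity = capacity;
    c->generation = 0;
}

/* Accessing an entry marks it fresh by stamping the current generation. */
static void
toy_touch(ToyCache *c, int slot)
{
    c->entries[slot].generation = c->generation;
}

/* Use the first free slot; returns its index, or -1 if the cache is full. */
static int
toy_insert(ToyCache *c)
{
    for (size_t i = 0; i < c->capacity; i++)
    {
        if (!c->entries[i].used)
        {
            c->entries[i].used = true;
            c->nentries++;
            toy_touch(c, (int) i);
            return (int) i;
        }
    }
    return -1;
}

static bool
toy_is_stale(const ToyCache *c, const ToyEntry *e)
{
    return e->used && c->generation - e->generation >= STALE_AGE;
}

/* Step 3: called once a minute; bumps the generation, reclaims nothing. */
static void
toy_tick(ToyCache *c)
{
    c->generation++;
}

/*
 * Steps 1 and 2: called when the cache is about to grow.  Returns true
 * if it resized, false if it reclaimed stale entries instead.
 */
static bool
toy_maybe_resize(ToyCache *c)
{
    size_t total = c->nentries;
    size_t stale = 0;
    size_t freed = 0;

    for (size_t i = 0; i < c->capacity; i++)
        if (toy_is_stale(c, &c->entries[i]))
            stale++;

    if (stale * 100 > (size_t) RECLAIM_PCT * total)
    {
        /* Enough stale entries: free them, but stop at the threshold. */
        for (size_t i = 0;
             i < c->capacity && freed * 100 < (size_t) RECLAIM_PCT * total;
             i++)
        {
            if (toy_is_stale(c, &c->entries[i]))
            {
                c->entries[i].used = false;
                c->nentries--;
                freed++;
            }
        }
        return false;
    }

    /* Not enough to reclaim: grow (clamped here) and bump the generation. */
    if (c->capacity * 2 <= MAX_ENTRIES)
        c->capacity *= 2;
    c->generation++;
    return true;
}
```

A caller would invoke toy_maybe_resize() when an insert finds the
cache full and toy_tick() from a once-a-minute timer. Because the tick
only advances the generation, an idle connection keeps its entries
until the next resize boundary.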

Greetings,

Andres Freund
