Re: Protect syscache from bloating with negative cache entries

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: alvherre(at)2ndquadrant(dot)com
Cc: tomas(dot)vondra(at)2ndquadrant(dot)com, bruce(at)momjian(dot)us, andres(at)anarazel(dot)de, robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com, michael(dot)paquier(at)gmail(dot)com, david(at)pgmasters(dot)net, craig(at)2ndquadrant(dot)com
Subject: Re: Protect syscache from bloating with negative cache entries
Date: 2019-02-07 06:24:18
Message-ID: 20190207.152418.139132570.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

At Tue, 5 Feb 2019 19:05:26 -0300, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote in <20190205220526(dot)GA1442(at)alvherre(dot)pgsql>
> On 2019-Feb-05, Tomas Vondra wrote:
>
> > I don't think we need to remove the expired entries right away, if there
> > are only very few of them. The cleanup requires walking the hash table,
> > which means significant fixed cost. So if there are only few expired
> > entries (say, less than 25% of the cache), we can just leave them around
> > and clean them if we happen to stumble on them (although that may not be
> > possible with dynahash, which has no concept of expiration) of before
> > enlarging the hash table.
>
> I think seqscanning the hash table is going to be too slow; Ideriha-san
> idea of having a dlist with the entries in LRU order (where each entry
> is moved to head of list when it is touched) seemed good: it allows you
> to evict older ones when the time comes, without having to scan the rest
> of the entries. Having a dlist means two more pointers on each cache
> entry AFAIR, so it's not a huge amount of memory.

Ah, I had a separate list in my mind. Sounds reasonable to have
pointers in cache entry. But I'm not sure how much additional
dlist_* impact.

The attached is the new version with the following properties:

- Both prune-by-age and hard limiting feature.
(Merged into single function, single scan)
Debug tracking feature in CatCacheCleanupOldEntries is removed
since it no longer runs a full scan.

Prune-by-age can be a single-setup-for-all-cache feature but
the hard limit is obviously not. We could use reloptions for
the purpose (which is not currently available on pg_class and
pg_attribute:p). I'll add that if there's no strong objection.
Or is there anyone comes up with something sutable for the
purpose?

- Using LRU to get rid of full scan.

I added new API dlist_move_to_tail which was needed to construct LRU.

I'm going to retake numbers with search-only queries.

> > So if we want to address this case too (and we probably want), we may
> > need to discard the old cache memory context someho (e.g. rebuild the
> > cache in a new one, and copy the non-expired entries). Which is a nice
> > opportunity to do the "full" cleanup, of course.
>
> Yeah, we probably don't want to do this super frequently though.

MemoryContext per cache?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v12-0001-Add-dlist_move_tail.patch text/x-patch 1.2 KB
v12-0002-Remove-entries-that-haven-t-been-used-for-a-certain-.patch text/x-patch 16.9 KB
v12-0003-Syscache-usage-tracking-feature.patch text/x-patch 35.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2019-02-07 06:41:03 Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Previous Message Noah Misch 2019-02-07 05:50:11 Re: Synchronize with imath upstream