Re: Protect syscache from bloating with negative cache entries

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com, alvherre(at)alvh(dot)no-ip(dot)org, andres(at)anarazel(dot)de, robertmhaas(at)gmail(dot)com, michael(dot)paquier(at)gmail(dot)com, david(at)pgmasters(dot)net, Jim(dot)Nasby(at)bluetreble(dot)com, craig(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Protect syscache from bloating with negative cache entries
Date: 2018-03-15 05:12:46
Message-ID: 20180315.141246.130742928.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

At Mon, 12 Mar 2018 17:34:08 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20180312(dot)173408(dot)162882093(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> > > In short, it's not really apparent to me that negative syscache entries
> > > are the major problem of this kind. I'm afraid that you're drawing very
> > > large conclusions from a specific workload. Maybe we could fix that
> > > workload some other way.
> >
> > The current patch doesn't consider whether an entry is negative
> > or positive(?). Just clean up all entries based on time.
> >
> > If relation has to have the same characteristics as syscaches, it
> > might be better to put it on the catcache mechanism, instead of adding
> > the same pruning mechanism to dynahash.
>
> For the moment, I added such a feature to dynahash and let only
> relcache use it in this patch. Hash elements have a different shape
> in a "prunable" hash, and pruning is performed in a similar way,
> sharing the setting with syscache. This seems to work fine.

I also considered plancache. Its biggest difference from
catcache and relcache is that its entries cannot be removed
voluntarily, since CachedPlanSource, the root struct of a plan
cache entry, holds some indispensable information. For prepared
queries, even if we stored that information in another location,
for example in the "Prepared Queries" hash, that would merely
move a large amount of data to another place.

Looking into CachedPlanSource, the generic plan is the part that
can be removed safely, since it is rebuilt as necessary. Dropping
the generic plan from "old" plancache entries, while keeping the
entries themselves, can reduce memory usage.
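
To illustrate the idea, here is a toy model in Python. It is only a
sketch, not the patch's C code: the class PlanSource, its fields, and
the build_plan callback are hypothetical names made up for this
example. The point is that the query text is kept forever, while the
generic plan can be dropped and is rebuilt on the next use.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PlanSource:
    """Toy stand-in for a plan cache entry (not PostgreSQL's struct)."""
    query_text: str                     # indispensable: always kept
    build_plan: Callable[[str], str]    # hypothetical planner callback
    generic_plan: Optional[str] = None  # derived data: safe to discard
    builds: int = 0                     # number of times a plan was built

    def get_plan(self) -> str:
        # Rebuild the generic plan lazily if it has been dropped.
        if self.generic_plan is None:
            self.generic_plan = self.build_plan(self.query_text)
            self.builds += 1
        return self.generic_plan

    def release_generic_plan(self) -> None:
        # Free only the rebuildable part; the entry itself stays valid.
        self.generic_plan = None


src = PlanSource("select 1", lambda q: f"PLAN({q})")
first = src.get_plan()        # builds the plan
src.release_generic_plan()    # simulates pruning
second = src.get_plan()       # rebuilds the same plan on demand
```

The cost of dropping a plan is one extra planning cycle the next time
the statement is executed.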

For testing purposes, I made 50000 prepared statements like
"select sum(c) from p where e < $" on 100 partitions.

With the feature (0004 patch) disabled, the VSZ of the backend
exceeds 3GB (and is still increasing at the moment), while it
stops increasing at about 997MB with min_cached_plans = 1000 and
plancache_prune_min_age = '10s'.
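
How the two settings could interact can be sketched as follows. This
is a Python toy under my own assumptions, not the logic of the 0004
patch: I assume that generic plans idle for at least
plancache_prune_min_age are dropped, oldest first, and that at least
min_cached_plans generic plans are always left in place. The names
Entry and prune_generic_plans are hypothetical.

```python
class Entry:
    """Toy plan cache entry; the generic plan is the prunable part."""

    def __init__(self, name, last_used):
        self.name = name
        self.last_used = last_used          # seconds, arbitrary clock
        self.generic_plan = f"plan for {name}"


def prune_generic_plans(entries, now, min_cached_plans, prune_min_age):
    """Drop old generic plans; return the names of the pruned entries.

    Assumed policy: never leave fewer than min_cached_plans generic
    plans, and never touch a plan used within prune_min_age seconds.
    """
    holders = [e for e in entries if e.generic_plan is not None]
    holders.sort(key=lambda e: e.last_used)  # least recently used first
    removable = len(holders) - min_cached_plans
    pruned = []
    for e in holders:
        if removable <= 0:
            break                            # keep the minimum number
        if now - e.last_used < prune_min_age:
            break                            # the rest are newer still
        e.generic_plan = None                # the entry itself survives
        pruned.append(e.name)
        removable -= 1
    return pruned


entries = [Entry(f"s{i}", last_used=float(i)) for i in range(5)]
pruned = prune_generic_plans(entries, now=20.0,
                             min_cached_plans=2, prune_min_age=10.0)
```

Here all five plans are older than the minimum age, so the three
least recently used ones are dropped and two remain.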

# 10s is clearly too short for actual use, of course.

The saving is expected to be significant if the plans are large
enough, but I'm still not sure whether it is worth doing, or
whether this is the right way.

Attached is the patch set, including this plancache stuff.

0001- catcache time-based expiration (the origin of this thread)
0002- introduces a dynahash pruning feature
0003- implements relcache pruning using 0002
0004- (perhaps) independent of the three above. PoC of
      plancache pruning. Details are described above.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment                                                         Content-Type  Size
0001-Remove-entries-that-haven-t-been-used-for-a-certain-.patch    text/x-patch  13.1 KB
0002-introduce-dynhash-pruning.patch                               text/x-patch  13.6 KB
0003-Apply-purning-to-relcache.patch                               text/x-patch  1.7 KB
0004-PoC-of-generic-plan-removal-of-PlanCacheSource.patch          text/x-patch  14.6 KB
