From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | hlinnaka(at)iki(dot)fi |
Cc: | andres(at)anarazel(dot)de, robertmhaas(at)gmail(dot)com, ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com, tomas(dot)vondra(at)2ndquadrant(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com, alvherre(at)2ndquadrant(dot)com, bruce(at)momjian(dot)us, pgsql-hackers(at)lists(dot)postgresql(dot)org, michael(dot)paquier(at)gmail(dot)com, david(at)pgmasters(dot)net, craig(at)2ndquadrant(dot)com |
Subject: | Re: Protect syscache from bloating with negative cache entries |
Date: | 2021-01-28 07:50:44 |
Message-ID: | 20210128.165044.1288517296648402194.horikyota.ntt@gmail.com |
Lists: | pgsql-hackers |
At Wed, 27 Jan 2021 13:11:55 +0200, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote in
> On 27/01/2021 03:13, Kyotaro Horiguchi wrote:
> > At Thu, 14 Jan 2021 17:32:27 +0900 (JST), Kyotaro Horiguchi
> > <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> >> The commit 4656e3d668 (debug_invalidate_system_caches_always)
> >> conflicted with this patch. Rebased.
> > At Wed, 27 Jan 2021 10:07:47 +0900 (JST), Kyotaro Horiguchi
> > <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> >> (I found a bug in a benchmark-aid function
> >> (CatalogCacheFlushCatalog2), I repost an updated version soon.)
> > I noticed that the catcachebench-aid function
> > CatalogCacheFlushCatalog2() allocates the bucket array wrongly in the
> > current memory context, which leads to a crash.
> > This is a fixed and rebased version.
>
> Thanks, with the scripts you provided, I was able to run the
> performance tests on my laptop, and got very similar results as you
> did.
>
> The impact of v7-0002-Remove-dead-flag-from-catcache-tuple.patch is
> very small. I think I could see it in the tests, but only barely. And
> the tests did nothing else than do syscache lookups; in any real world
> scenario, it would be lost in noise. I think we can put that aside for
> now, and focus on v6-0001-CatCache-expiration-feature.patch:
I agree with that opinion. But it is a bit disappointing that the long
struggle ended up in vain :p
> The pruning is still pretty lethargic:
>
> - Entries created in the same transactions are never pruned away
>
> - The size of the hash table is never shrunk. So even though the patch
>   puts a backstop to the hash table growing indefinitely, if you run one
>   transaction that bloats the cache, it's bloated for the rest of the
>   session.
Right. But more frequent checks would impact performance. We could do
more aggressive pruning at idle time.
> I think that's OK. We might want to be more aggressive in the future,
> but for now it seems reasonable to lean towards the current behavior
> where nothing is pruned. Although I wonder if we should try to set
> 'catcacheclock' more aggressively. I think we could set it whenever
> GetCurrentTimestamp() is called, for example.
Ah. I hadn't thought in that direction. Something like
global_last_acquired_timestamp?
> Given how unaggressive this mechanism is, I think it should be safe to
> enable it by default. What would be a suitable default for
> catalog_cache_prune_min_age? 30 seconds?
Without thinking it through in detail, that seems a bit too short. The
value suggested so far for this variable is 300-600s; that is,
intermittent queries issued at roughly 5-10 minute intervals don't lose
their cache entries.
In a bad case, two queries could alternately remove each other's cache
entries:
Q1: adds 100 entries
<1 minute passes>
Q2: adds 100 entries, but a rehash is going to happen at 150 entries,
and the existing 100 entries added by Q1 are removed.
<1 minute passes>
Q1: adds 100 entries, but a rehash is going to happen at 150 entries,
and the existing 100 entries added by Q2 are removed.
<repeats>
Or a sequence of transactions that persists longer than that many
seconds may lose some of its catcache entries.
> Documentation needs to be updated for the new GUC.
>
> Attached is a version with a few little cleanups:
> - use TimestampTz instead of uint64 for the timestamps
> - remove assign_catalog_cache_prune_min_age(). All it did was convert
>   the GUC's value from seconds to microseconds, and stored it in a
>   separate variable. Multiplication is cheap, so we can just do it when
>   we use the GUC's value instead.
Yeah, the latter is a trace of the struggle to cut down CPU cycles on
the normal paths. I don't object to doing so.
I found that some comments are apparently stale: cp->cc_oldest_ts is
not used anywhere, even though it was added for deciding whether to
scan or not.
I fixed the following points in the attached.
- Removed some comments that are obvious. ("Timestamp in us")
- Added a cp->cc_oldest_ts check in CatCacheCleanupOldEntries.
- Set the default value for catalog_cache_prune_min_age to 600s.
- Added a doc entry for the new GUC in the resource/memory section.
- Fixed some code comments.
- Adjusted the pruning criterion from (ct->lastaccess < prune_threshold) to <=.
I was going to write in the doc something like "you can inspect memory
consumption by catalog caches using pg_backend_memory_contexts", but
all the memory used by the catalog caches is in CacheMemoryContext.
Would it be sensible for each catalog cache to have its own context?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
v9-0001-CatCache-expiration-feature.patch | text/x-patch | 8.8 KB |