|From:||Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>|
|Cc:||alvherre(at)2ndquadrant(dot)com, bruce(at)momjian(dot)us, andres(at)anarazel(dot)de, robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com, michael(dot)paquier(at)gmail(dot)com, david(at)pgmasters(dot)net, craig(at)2ndquadrant(dot)com|
|Subject:||Re: Protect syscache from bloating with negative cache entries|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
Thank you for testing and the commits, Tomas.
At Sat, 9 Feb 2019 19:09:59 +0100, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote in <74386116-0bc5-84f2-e614-0cff19aca2de(at)2ndquadrant(dot)com>
> On 2/7/19 1:18 PM, Kyotaro HORIGUCHI wrote:
> > At Thu, 07 Feb 2019 15:24:18 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20190207(dot)152418(dot)139132570(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> I've done a bunch of benchmarks on v13, and I don't see any serious
> regression either. Each test creates a number of tables (100, 1k, 10k,
> 100k and 1M) and then runs SELECT queries on them. The tables are
> accessed randomly - with either uniform or exponential distribution. For
> each combination there are 5 runs, 60 seconds each (see the attached
> shell scripts, it should be pretty obvious).
> I've done the tests on two different machines - small one (i5 with 8GB
> of RAM) and large one (e5-2620v4 with 64GB RAM), but the behavior is
> almost exactly the same (with the exception of 1M tables, which does not
> fit into RAM on the smaller one).
> On the xeon, the results (throughput compared to master) look like this:
> uniform 100 1000 10000 100000 1000000
> v13 105.04% 100.28% 102.96% 102.11% 101.54%
> v13 (nodata) 97.05% 98.30% 97.42% 96.60% 107.55%
> exponential 100 1000 10000 100000 1000000
> v13 100.04% 103.48% 101.70% 98.56% 103.20%
> v13 (nodata) 97.12% 98.43% 98.86% 98.48% 104.94%
> The "nodata" case means the tables were empty (so no files created),
> while in the other case each table contained 1 row.
> Per the results it's mostly break even, and in some cases there is
> actually a measurable improvement.
Great! I guess it comes from reduced size of hash?
> That being said, the question is whether the patch actually reduces
> memory usage in a useful way - that's not something this benchmark
> validates. I plan to modify the tests to make pgbench script
> time-dependent (i.e. to pick a subset of tables depending on time).
> A couple of things I've happened to notice during a quick review:
> 1) The sgml docs in 0002 talk about "syscache_memory_target" and
> "syscache_prune_min_age", but those options were renamed to just
> "cache_memory_target" and "cache_prune_min_age".
I'm at a loss how call syscache for users. I think it is "catalog
cache". The most basic component is called catcache, which is
covered by the syscache layer, both of then are not revealed to
users, and it is shown to user as "catalog cache".
"catalog_cache_prune_min_age", "catalog_cache_memory_target", (if
exists) "catalog_cache_entry_limit" and
"catalog_cache_prune_ratio" make sense?
> 2) "cache_entry_limit" is not mentioned in sgml docs at all, and it's
> defined three times in guc.c for some reason.
It is just PoC, added to show how it looks. (The multiple
instances must bex a result of a convulsion of my fingers..) I
think this is not useful unless it can be specfied per-relation
or per-cache basis. I'll remove the GUC and add reloptions for
the purpose. (But it won't work for pg_class and pg_attribute
> 3) I don't see why to define PRUNE_BY_AGE and PRUNE_BY_NUMBER, instead
> of just using two bool variables prune_by_age and prune_by_number doing
> the same thing.
Agreed. It's a kind of memory-stingy, which is useless there.
> 4) I'm not entirely sure about using stmtStartTimestamp. Doesn't that
> pretty much mean long-running statements will set the lastaccess to very
> old timestamp? Also, it means that long-running statements (like a PL
> function accessing a bunch of tables) won't do any eviction at all, no?
> AFAICS we'll set the timestamp only once, at the very beginning.
> I wonder whether using some other timestamp source (like a timestamp
> updated regularly from a timer, or something like that).
I didin't consider planning that happen within a function. If
5min is the default for catalog_cache_prune_min_age, 10% of it
(30s) seems enough and gettieofday() with such intervals wouldn't
affect forground jobs. I'd choose catalog_c_p_m_age/10 rather
than fixed value 30s and 1s as the minimal.
I obeserved significant degradation by setting up timer at every
statement start. The patch is doing the followings to get rid of
(1) Every statement updates the catcache timestamp as currently
(2) The timestamp is also updated periodically using timer
separately from (1). The timer starts if not yet at the time
of (1). (SetCatCacheClock, UpdateCatCacheClock)
(3) Statement end and transaction end don't stop the timer, to
avoid overhead of setting up a timer. (
(4) But it stops by error. I choosed not to change the thing in
PostgresMain that it kills all timers on error.
(5) Also changing the GUC catalog_cache_prune_min_age kills the
timer, in order to reflect the change quickly especially when
it is shortened.
> 5) There are two fread() calls in 0003 triggering a compiler warning
> about unused return value.
Ugg. It's in PoC style... (But my compiler didn't complain about
it) Maybe fixed.
NTT Open Source Software Center
|Next Message||Kyotaro HORIGUCHI||2019-02-12 11:36:28||Re: Protect syscache from bloating with negative cache entries|
|Previous Message||Matsumura, Ryo||2019-02-12 10:36:58||RE: [PROPOSAL]a new data type 'bytea' for ECPG|