Re: Protect syscache from bloating with negative cache entries

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, alvherre(at)2ndquadrant(dot)com
Cc: bruce(at)momjian(dot)us, andres(at)anarazel(dot)de, robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com, michael(dot)paquier(at)gmail(dot)com, david(at)pgmasters(dot)net, craig(at)2ndquadrant(dot)com
Subject: Re: Protect syscache from bloating with negative cache entries
Date: 2019-02-09 18:09:59
Message-ID: 74386116-0bc5-84f2-e614-0cff19aca2de@2ndquadrant.com
Lists: pgsql-hackers

On 2/7/19 1:18 PM, Kyotaro HORIGUCHI wrote:
> At Thu, 07 Feb 2019 15:24:18 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20190207(dot)152418(dot)139132570(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
>> I'm going to retake numbers with search-only queries.
>
> Yeah, I was stupid.
>
> I re-ran the benchmark using "-S -T 30" on a server built with no
> assertions and -O2. The numbers are the best of three successive
> attempts. The patched version is running with cache_target_memory =
> 0, cache_prune_min_age = 600 and cache_entry_limit = 0, but pruning
> doesn't happen with this workload.
>
> master : 13393 tps
> v12    : 12625 tps (-6%)
>
> A significant degradation is observed.
>
> Reducing the frequency of dlist_move_tail by enforcing a 1ms
> interval between two successive updates of the same entry made the
> degradation disappear.
>
> patched : 13720 tps (+2%)
>
> I think there's still no need for such a high frequency. It is
> 100ms in the attached patch.
>
> # I'm not sure the name LRU_IGNORANCE_INTERVAL makes sense..
>

Hi,

I've done a bunch of benchmarks on v13, and I don't see any serious
regression either. Each test creates a number of tables (100, 1k, 10k,
100k and 1M) and then runs SELECT queries on them. The tables are
accessed randomly, with either uniform or exponential distribution. For
each combination there are 5 runs, 60 seconds each (see the attached
shell scripts; they should be pretty obvious).

I've done the tests on two different machines - a small one (i5 with
8GB of RAM) and a large one (E5-2620 v4 with 64GB of RAM) - but the
behavior is almost exactly the same (with the exception of the 1M
tables case, which does not fit into RAM on the smaller one).

On the Xeon, the results (throughput compared to master) look like this:

uniform           100     1000    10000   100000  1000000
------------------------------------------------------------
v13           105.04%  100.28%  102.96%  102.11%  101.54%
v13 (nodata)   97.05%   98.30%   97.42%   96.60%  107.55%

exponential       100     1000    10000   100000  1000000
------------------------------------------------------------
v13           100.04%  103.48%  101.70%   98.56%  103.20%
v13 (nodata)   97.12%   98.43%   98.86%   98.48%  104.94%

The "nodata" case means the tables were empty (so no files created),
while in the other case each table contained 1 row.

Per the results, it's mostly break-even, and in some cases there is
actually a measurable improvement.

That being said, the question is whether the patch actually reduces
memory usage in a useful way - that's not something this benchmark
validates. I plan to modify the tests to make the pgbench script
time-dependent (i.e. to pick a different subset of the tables depending
on the elapsed time).

A couple of things I've happened to notice during a quick review:

1) The SGML docs in 0002 talk about "syscache_memory_target" and
"syscache_prune_min_age", but those options were renamed to just
"cache_memory_target" and "cache_prune_min_age".

2) "cache_entry_limit" is not mentioned in the SGML docs at all, and
it's defined three times in guc.c for some reason.

3) I don't see why we need to define PRUNE_BY_AGE and PRUNE_BY_NUMBER,
instead of just using two bool variables, prune_by_age and
prune_by_number, doing the same thing.
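
A hypothetical sketch of what I mean (variable names are mine, not
lifted from the patch):

    #include <stdbool.h>

    /* Illustrative only -- the actual conditions live in the patch. */
    static bool
    should_prune(long entry_age_ms, long nentries,
                 long cache_prune_min_age, long cache_entry_limit)
    {
        bool        prune_by_age = (entry_age_ms >= cache_prune_min_age);
        bool        prune_by_number = (nentries > cache_entry_limit);

        return prune_by_age || prune_by_number;
    }

The bit flags don't seem to buy us anything over two plain booleans.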

4) I'm not entirely sure about using stmtStartTimestamp. Doesn't that
pretty much mean long-running statements will set lastaccess to a very
old timestamp? Also, it means that long-running statements (like a PL
function accessing a bunch of tables) won't do any eviction at all, no?
AFAICS we'll set the timestamp only once, at the very beginning.

I wonder whether using some other timestamp source (like a timestamp
updated regularly from a timer, or something like that) would be more
appropriate.
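
A minimal sketch of the timer idea, using plain POSIX setitimer (the
backend would presumably use its existing timeout infrastructure
instead, and all names here are made up):

    #include <signal.h>
    #include <string.h>
    #include <sys/time.h>
    #include <time.h>

    /* coarse clock, refreshed once per second by the timer */
    static volatile sig_atomic_t coarse_now;

    static void
    clock_tick(int signo)
    {
        coarse_now = (sig_atomic_t) time(NULL);
    }

    static void
    start_coarse_clock(void)
    {
        struct sigaction sa;
        struct itimerval it;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = clock_tick;
        sigaction(SIGALRM, &sa, NULL);

        memset(&it, 0, sizeof(it));
        it.it_interval.tv_sec = 1;      /* fire every second */
        it.it_value.tv_sec = 1;
        setitimer(ITIMER_REAL, &it, NULL);

        coarse_now = (sig_atomic_t) time(NULL);
    }

Cache accesses could then stamp entries by reading coarse_now, which
is much cheaper than calling gettimeofday() on every lookup and stays
fresh even during a long-running statement.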

5) There are two fread() calls in 0003 triggering a compiler warning
about unused return value.
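
The usual fix is to actually check the result and treat a short read
as an error; something like this (function name and message are mine,
not from 0003):

    /* assumes the usual postgres.h / elog() environment */
    static void
    read_stats_chunk(FILE *fp, void *buf, size_t len)
    {
        if (fread(buf, 1, len, fp) != len)
            elog(ERROR, "could not read cache state file");
    }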

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment      Content-Type                                    Size
run-data.sh     application/x-shellscript                       1.8 KB
run-nodata.sh   application/x-shellscript                       1.7 KB
syscache.ods    application/vnd.oasis.opendocument.spreadsheet  14.0 KB
