RE: Protect syscache from bloating with negative cache entries

From: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
To: 'Bruce Momjian' <bruce(at)momjian(dot)us>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "alvherre(at)alvh(dot)no-ip(dot)org" <alvherre(at)alvh(dot)no-ip(dot)org>, "michael(dot)paquier(at)gmail(dot)com" <michael(dot)paquier(at)gmail(dot)com>, "david(at)pgmasters(dot)net" <david(at)pgmasters(dot)net>, "craig(at)2ndquadrant(dot)com" <craig(at)2ndquadrant(dot)com>
Subject: RE: Protect syscache from bloating with negative cache entries
Date: 2019-02-19 07:08:14
Message-ID: 0A3221C70F24FB45833433255569204D1FB9D860@G01JPEXMBYT05
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

From: 'Bruce Momjian' [mailto:bruce(at)momjian(dot)us]
> I think, in general, smaller is better, as long as making something
> smaller doesn't remove data that is frequently accessed. Having a timer
> to expire only old entries seems like it accomplished this goal.
>
> Having a minimum size and not taking it to zero size makes sense if we
> know we will need certain entries like pg_class in the next query.
> However, if the session is idle for hours, we should just probably
> remove everything, so maybe the minimum doesn't make sense --- just
> remove everything.

That's another interesting idea. A somewhat relevant feature is Oracle's "ALTER SYSTEM FLUSH SHARED_POOL". It flushes all dictionary cache, library cache, and SQL plan entries. The purpose is different: not to release memory, but to defragment the shared memory.

> I don't think other DBMSs are a good model since they have a reputation
> for requiring a lot of tuning --- tuning that we have often automated.

Yeah, I agree that PostgreSQL is easier to use in many aspects.

On the other hand, although I hesitate to say this (please don't get upset...), I feel PostgreSQL is a bit too loose about memory usage. To my memory, PostgreSQL crashed OS due to OOM in our user environments:

* Creating and dropping temp tables repeatedly in a stored PL/pgSQL function. This results in infinite CacheMemoryContext bloat. This is referred to at the beginning of this mail thread.
Oracle and MySQL can limit the size of the dictionary cache.

* Each pair of SAVEPOINT/RELEASE leaves 8KB of CurTransactionContext. The customer used psqlODBC to run a batch app, which ran millions of SQL statements in a transaction. psqlODBC wraps each SQL statement with SAVEPOINT and RELEASE by default.
I guess this is what caused the crash of AWS Aurora in last year's Amazon Prime Day.

* Setting a large value to work_mem, and then run many concurrent large queries.
Oracle can limit the total size of all sessions' memory with PGA_AGGREGATE_TARGET parameter.

We all have to manage things within resource constraints. The DBA wants to make sure the server doesn't overuse memory to avoid crash or slowdown due to swapping. Oracle does it, and another open source database, MySQL, does it too. PostgreSQL does it with shared_buffers, wal_buffers, and work_mem (within a single session). Then, I thought it's natural to do it with catcache/relcache/plancache.

Regards
Takayuki Tsunakawa

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2019-02-19 07:10:21 Re: Change of email address
Previous Message Thomas Munro 2019-02-19 07:03:05 Some thoughts on NFS