Re: Protect syscache from bloating with negative cache entries

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <robertmhaas(at)gmail(dot)com>, <craig(at)2ndquadrant(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Protect syscache from bloating with negative cache entries
Date: 2017-01-23 22:54:36
Message-ID: 21803f50-a823-c444-ee2b-9a153114f454@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1/21/17 6:42 PM, Jim Nasby wrote:
> On 12/26/16 2:31 AM, Kyotaro HORIGUCHI wrote:
>> The points of discussion are the following, I think.
>>
>> 1. The first patch seems working well. It costs the time to scan
>> the whole of a catcache that have negative entries for other
>> reloids. However, such negative entries are created by rather
>> unusual usages. Accesing to undefined columns, and accessing
>> columns on which no statistics have created. The
>> whole-catcache scan occurs on ATTNAME, ATTNUM and
>> STATRELATTINH for every invalidation of a relcache entry.
>
> I took a look at this. It looks sane, though I've got a few minor
> comment tweaks:
>
> + * Remove negative cache tuples maching a partial key.
> s/maching/matching/
>
> +/* searching with a paritial key needs scanning the whole cache */
>
> s/needs/means/
>
> + * a negative cache entry cannot be referenced so we can remove
>
> s/referenced/referenced,/
>
> I was wondering if there's a way to test the performance impact of
> deleting negative entries.

I did a make installcheck run with CATCACHE_STATS to see how often we
get negative entries in the 3 caches affected by this patch. The caches
on pg_attribute get almost no negative entries. pg_statistic gets a good
amount of negative entries, presumably because we start off with no
entries in there. On a stable system that presumably won't be an issue,
but if temporary tables are in use and being analyzed I'd think there
could be a moderate amount of inval traffic on that cache. I'll leave it
to a committer to decide if they thing that's an issue, but you might
want to try and quantify how big a hit that is. I think it'd also be
useful to know how much bloat you were seeing in the field.

The patch is currently conflicting against master though, due to some
caches being added. Can you rebase? BTW, if you set a slightly larger
context size on the patch you might be able to avoid rebases; right now
the patch doesn't include enough context to uniquely identify the chunks
against cacheinfo[].
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Corey Huinker 2017-01-23 23:03:16 Re: \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)
Previous Message Craig Ringer 2017-01-23 22:37:05 Re: Logical decoding on standby