Re: Some other CLOBBER_CACHE_ALWAYS culprits

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Some other CLOBBER_CACHE_ALWAYS culprits
Date: 2021-05-14 21:25:53
Message-ID: 20210514212553.osvbyllzls2ludtd@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2021-05-14 16:53:16 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > In essence, debug_invalidate_system_caches_always=1 in some important aspects
> > behaves like debug_invalidate_system_caches_always=3, due to the syscache
> > involvement.
>
> Yeah. I think it's important to test those recursive invalidation
> scenarios, but it could likely be done more selectively.

Agreed. I wonder if the logic could be something like indicating that we
don't invalidate due to pg_class/attribute/am/... (a set of super common
system catalogs) being opened, iff that open is at the "top level". So
we'd e.g. not trigger invalidation for a syscache miss scanning
pg_class, unless the miss happens during a relcache build. But we would
continue to trigger invalidations without further checks if
e.g. pg_subscription is opened.
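
To make the rule concrete, here is a rough sketch of the decision (not code from any patch). The catalog list, the function names and the in_relcache_build flag are all hypothetical, and it assumes "attribute" and "am" mean pg_attribute and pg_am:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical set of "super common" catalogs. The email only names
 * pg_class, attribute and am, followed by "...".
 */
static const char *const common_catalogs[] = {
    "pg_class", "pg_attribute", "pg_am"
};

static bool
is_common_catalog(const char *relname)
{
    size_t n = sizeof(common_catalogs) / sizeof(common_catalogs[0]);

    for (size_t i = 0; i < n; i++)
        if (strcmp(relname, common_catalogs[i]) == 0)
            return true;
    return false;
}

/*
 * Decide whether opening relname should force a cache invalidation.
 * in_relcache_build stands in for "the open is not at the top level".
 */
static bool
should_invalidate_on_open(const char *relname, bool in_relcache_build)
{
    assert(relname != NULL);

    if (!is_common_catalog(relname))
        return true;            /* e.g. pg_subscription: always invalidate */
    return in_relcache_build;   /* common catalog: only when nested */
}
```

With this, should_invalidate_on_open("pg_class", false) skips the flush, while the same open during a relcache build, or any open of pg_subscription, still triggers one.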

> > What about having a mode where each "nesting" level of SearchCatCacheMiss
> > allows only one interior InvalidateSystemCaches()?
>
> An idea I'd been toying with was to make invals probabilistic, that is
> there would be X% chance of an inval being forced at any particular
> opportunity. Then you could dial X up or down to make a tradeoff
> between speed and the extent of coverage you get from a single run.
> (Over time, you could expect pretty complete coverage even with X
> not very close to 1, I think.)
>
> This could be extended to what you're thinking about by reducing X
> (according to some rule or other) for each level of cache-flush
> recursion. The argument to justify that is that recursive cache
> flushes are VERY repetitive, so that even a small probability will
> add up to full coverage of those code paths fairly quickly.

That'd make sense; I've been wondering about something similar. But I'm
a bit worried that it would make problems harder to reproduce
reliably?
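
One possible way to keep probabilistic invals reproducible is to draw from a seedable generator and log the seed, so a failing run can be replayed. The sketch below is only an illustration under those assumptions: the function names, the xorshift32 generator and the "halve X per recursion level" rule are all made up here, not taken from any proposal in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical generator state; reusing a seed replays the same decisions. */
static uint32_t inval_rng_state = 1;

static void
inval_seed(uint32_t seed)
{
    inval_rng_state = seed ? seed : 1;  /* xorshift state must be nonzero */
}

/* Plain xorshift32 step. */
static uint32_t
inval_next(void)
{
    uint32_t x = inval_rng_state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    inval_rng_state = x;
    return x;
}

/*
 * Force an inval with roughly (base_percent / 2^depth) percent chance,
 * where depth is the current cache-flush recursion level. Halving per
 * level is an arbitrary illustrative rule.
 */
static bool
should_force_inval(unsigned base_percent, unsigned depth)
{
    unsigned percent;

    assert(base_percent <= 100);
    percent = depth < 32 ? base_percent >> depth : 0;
    return inval_next() % 100 < percent;
}
```

Calling inval_seed() with the same value before a run yields the same sequence of decisions, provided the invalidation opportunities are themselves reached in the same order.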

> I've not worked out the math to justify any specific proposal
> along this line, though.

FWIW, I've prototyped the idea of only invalidating once for each
syscache level, and it does reduce runtime of

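CREATE TABLE blarg_0(id serial primary key);
CREATE TABLE blarg_1(id serial primary key);
CREATE TABLE blarg_2(id serial primary key);
CREATE TABLE blarg_3(id serial primary key);
SET debug_invalidate_system_caches_always = 1;
SELECT * FROM blarg_0 JOIN blarg_1 USING (id) JOIN blarg_2 USING (id) JOIN blarg_3 USING (id);
RESET ALL;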

from 7.5s to 4.7s. The benefits are smaller when fewer tables are
accessed, and larger when more are (surprising, right :)).
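
For readers who want a picture of what "only invalidating once for each syscache level" could look like, here is a minimal sketch. It is not the prototype itself: the depth counter, the per-level flags, MAX_MISS_DEPTH and all function names are hypothetical, and inval_count merely stands in for a real InvalidateSystemCaches() call.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_MISS_DEPTH 16       /* hypothetical bound on nesting */

static int  miss_depth = 0;                     /* current nesting level */
static bool invalidated_at[MAX_MISS_DEPTH + 1]; /* one flag per level */
static int  inval_count = 0;                    /* stand-in for real flushes */

/* Called wherever the real code would consider flushing the caches. */
static void
maybe_invalidate(void)
{
    /* Outside any cache miss (depth 0) there is no restriction. */
    if (miss_depth > 0 && invalidated_at[miss_depth])
        return;                 /* this level already used its one flush */
    invalidated_at[miss_depth] = true;
    inval_count++;
}

/* Bracket the body of a (hypothetical) cache-miss handler. */
static void
enter_cache_miss(void)
{
    assert(miss_depth < MAX_MISS_DEPTH);
    miss_depth++;
    invalidated_at[miss_depth] = false; /* fresh budget for the new level */
}

static void
leave_cache_miss(void)
{
    assert(miss_depth > 0);
    miss_depth--;
}
```

Each nested miss gets a budget of one flush, and returning to an outer level does not refresh that outer level's budget, which is what bounds the recursion.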

Greetings,

Andres Freund
