Re: [HACKERS] Another nasty cache problem

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Another nasty cache problem
Date: 2000-01-30 17:57:58
Message-ID: 200001301757.MAA12393@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> I'm down to the point where the parallel tests mostly work with a small
> SI buffer --- but they do still sometimes fail. I've realized that
> there is a whole class of bugs along the following lines:
>
> There are plenty of routines that do two or more SearchSysCacheTuple
> calls to get the information they need. As the code stands, it is
> unsafe to continue accessing the tuple returned by SearchSysCacheTuple
> after making a second such call, because the second call could possibly
> cause an SI cache reset message to be processed, thereby flushing the
> contents of the caches.

Yes, I have always been aware of this problem. The issue is that since
cache entries are removed on a oldest-removed-first basis, I never
thought that several cache lookups would be a problem. If you do many
cache lookups and expect very old ones to still exist, that could be a
problem.

However, a full reset of the cache could cause major problems. Is there
a way to re-load the cache after the reset with the most recently cached
entries? Seems that would be easier. However, your issue is probably
that the new cache entries would have different locations from the old
entries. Is it possible to delay the cache reset of the five most
recent cache entries, and do them later? I don't see many places where
more than 2-3 cache entries are kept. Maybe we need to keep them around
somehow during cache reset.

> I am not sure what to do about it. One solution path is to make
> all the potential trouble spots do SearchSysCacheTupleCopy and then
> pfree the copied tuple when done. However, that adds a nontrivial
> amount of overhead, and it'd be awfully easy to miss some trouble
> spots or to introduce new ones in the future.

Sounds like a lot of overhead to do the copy.

>
> Another possibility is to introduce some sort of notion of a reference
> count, and to make the standard usage pattern be
> tuple = SearchSysCacheTuple(...);
> ... use tuple ...
> ReleaseSysCacheTuple(tuple);
> The idea here is that a tuple with positive refcount would not be
> deleted during a cache reset, but would simply be removed from its
> cache, and then finally deleted when released (or during elog
> recovery).

If you can do that, can't you just keep the few most recent ones by
default. Seems that would be very clean.

> This might allow us to get rid of SearchSysCacheTupleCopy, too,
> since the refcount should be just as good as palloc'ing one's own
> copy for most purposes.

Yes, that would be nice.

--
Bruce Momjian | http://www.op.net/~candle
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2000-01-30 18:22:05 Re: [HACKERS] Another nasty cache problem
Previous Message Tom Lane 2000-01-30 15:41:13 Another nasty cache problem