Re: SearchSysCacheTuple(Copy)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hiroshi Inoue <Inoue(at)tpf(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SearchSysCacheTuple(Copy)
Date: 2000-11-14 17:45:42
Message-ID: 2898.974223942@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I said:
> This class of bugs has been there since the beginning of Postgres,
> so I do not feel that we need to panic about it. Let's take the
> time to design and implement a proper solution, rather than expending
> effort on a stopgap solution that'll have to be undone later.

I've reviewed the earlier threads on SearchSysCache and done some more
thinking. Here is a concrete proposal for doing it right:

1. Add a reference-count field and an already-dead boolean to each
syscache entry. (The already-dead flag indicates that we received
an SI inval message for this tuple, but we couldn't delete it yet
because it has refcount > 0. A tuple with this flag set will never
be returned by a new SearchSysCache call, and it will be deleted
from the cache as soon as its refcount goes to zero.)

2. SearchSysCache() will increment the reference count of the tuple
it returns.

3. A new entry point ReleaseSysCache() will be provided to decrement
the reference count of a syscache entry. Thus, the standard calling
sequence becomes

tuple = SearchSysCacheTuple(...);
if (HeapTupleIsValid(tuple))
{
... use tuple ...
ReleaseSysCache(tuple);
}

4. SearchSysCacheTupleCopy() goes away, since we may as well use
SearchSysCacheTuple() and ReleaseSysCache() instead of
SearchSysCacheTupleCopy() and heap_freetuple().

5. Since SearchSysCache() is called from exactly one place, namely
SearchSysCacheTuple(), I am inclined to rename SearchSysCache()
to SearchCatCache() and then give the name SearchSysCache() to
the more widely used routine. So SearchSysCache() and
ReleaseSysCache() would really be the widely used pair of routines.

6. When a shared-cache-inval message arrives, the syscache code
behaves as follows for each affected cache entry:
A. If refcount is zero, just delete the entry.
B. Otherwise, set the "already-dead" flag, so that future cache
searches will not return this tuple and it will be released
once its refcount reaches zero.

7. At end of transaction (whether normal or abort), scan the syscaches
to reset refcounts to zero and delete any marked-dead entries. We
should not consider it a software error to leave syscache entries
still locked at end of transaction. (The parser, in particular,
would need a lot of work to avoid doing so, and I don't see much
value in expending such work.)

Note that this proposal does not include any attempt to detect whether
a cache inval message means that the tuple has actually been changed
(as opposed to just relocated, for example). It seems too expensive
to go out and re-read such tuples, and I'm not sure that it's safe to
try to read new cache entries during cache inval anyway. Besides,
most callers that are using a cache entry are happy to continue to use
the copy that was valid when they got it --- they don't care if a
subsequent transaction commit changes the tuple.

Callers that want to be certain they have a completely-up-to-date copy
should acquire a suitable lock on the associated system relation before
calling SearchSysCache(). In practice, the only callers that really need
this are places that are going to update or delete the tuple, and so they
need to acquire a write lock on the system relation anyway. The coding
rule is then just "lock the relation before finding the tuple to update,
not after". We have that rule in place already.

Comments? I think I might have time to do this before Vadim finishes
with WAL ;-)

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2000-11-14 17:48:35 Re: Re: UUNET socket-file-location patch
Previous Message Peter Eisentraut 2000-11-14 17:45:36 Re: Re: UUNET socket-file-location patch