Re: [HACKERS] Another nasty cache problem

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Another nasty cache problem
Date: 2000-01-31 15:24:23
Message-ID: 9850.949332263@sss.pgh.pa.us
Lists: pgsql-hackers

Peter Eisentraut <e99re41(at)DoCS(dot)UU(dot)SE> writes:
> This sort of thing should be documented,

... or changed ...

> Anyway, I just counted 254 uses of SearchSysCacheTuple in the backend tree
> and a majority of these are probably obviously innocent. Since I don't
> have any more developing planned, I would volunteer to take a look at all
> of those and look for violations of second cache look up, heap_open, and
> CommandCounterIncrement, fixing them where possible, or at least pointing
> them out to more experienced people. That might save you from going out of
> your way and instituting some reference count or whatever, and it would be
> an opportunity for me to read some code.

I appreciate the offer, but I don't really want to fix it that way.
If that's how things have to work, then the code will be *extremely*
fragile --- any routine that opens a relation or looks up a cache tuple
will potentially break its callers as well as itself. And since the
probability of failure is so low, we'll never find it; we'll just keep
getting the occasional irreproducible failure report from the field.
I think we need a designed-in solution rather than a restrictive coding
rule.
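
To make the hazard concrete, here is a minimal toy sketch in C. It is not PostgreSQL code: `ToyTuple`, `toy_lookup`, `toy_helper`, and `toy_caller` are hypothetical names, and the sketch swaps in slot reuse as a stand-in for the real mechanism (cache invalidation). The shape of the bug is the same, though: a caller holds a pointer into the cache, calls a routine that does its own lookup, and the pointer silently stops meaning what the caller thinks it means.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define TOY_SLOTS 4

/* Hypothetical stand-in for a cached catalog tuple. */
typedef struct {
    bool     used;
    unsigned key;
    char     name[32];
} ToyTuple;

static ToyTuple toy_slots[TOY_SLOTS];

/*
 * Hypothetical lookup: returns a pointer into the cache's own storage.
 * On a miss, it overwrites whatever previously occupied the slot.
 */
const ToyTuple *toy_lookup(unsigned key)
{
    ToyTuple *t = &toy_slots[key % TOY_SLOTS];

    if (!t->used || t->key != key) {
        t->used = true;
        t->key = key;
        snprintf(t->name, sizeof(t->name), "tuple-%u", key);
    }
    return t;
}

/* An innocent-looking helper that happens to do a lookup of its own. */
void toy_helper(void)
{
    (void) toy_lookup(5);   /* 5 % 4 == 1: same slot as key 1 */
}

/* Returns the key the caller ends up seeing through its held pointer. */
unsigned toy_caller(void)
{
    const ToyTuple *t = toy_lookup(1);

    assert(t->key == 1);    /* fine so far */
    toy_helper();           /* the callee changes the cache behind our back */
    return t->key;          /* no longer the tuple we looked up */
}
```

Nothing in `toy_caller` looks wrong when read in isolation, which is why a rule of the form "don't call anything that might touch the cache" is hard to audit by inspection.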

Also, I am not sure that the existing uses are readily fixable. For
example, I saw a number of crashes in the parser last night, most of
which traced to uses of Operator or Type pointers --- which are really
SearchSysCacheTuple results, but the parser passes them around with wild
abandon. I don't see any easy way of restructuring that code to avoid
this.

I am starting to think that Bruce's idea might be the way to go: lock
down any cache entry that's been referenced since the last transaction
start or CommandCounterIncrement, and elog() if it's changed by
invalidation. Then the only coding rule needed is "cached tuples don't
stay valid across CommandCounterIncrement", which is relatively
simple to check for.
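
A rough sketch of that idea in C, under stated assumptions: this is a toy model, not the backend's cache code, and every name in it (`PinEntry`, `pin_cache_store`, `pin_cache_lookup`, `pin_cache_invalidate`, `pin_command_boundary`) is hypothetical. The real elog() would abort the transaction; here the invalidation routine just prints a message and returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define PIN_SLOTS 8

/* Hypothetical toy cache entry; not PostgreSQL's actual structure. */
typedef struct {
    unsigned key;
    char     value[32];
    bool     valid;       /* entry holds usable data */
    bool     referenced;  /* handed out since the last command boundary */
} PinEntry;

static PinEntry pin_cache[PIN_SLOTS];

/* Load an entry, as if it had just been read from the catalog. */
void pin_cache_store(unsigned key, const char *value)
{
    PinEntry *e = &pin_cache[key % PIN_SLOTS];

    assert(value != NULL);
    e->key = key;
    strncpy(e->value, value, sizeof(e->value) - 1);
    e->value[sizeof(e->value) - 1] = '\0';
    e->valid = true;
    e->referenced = false;
}

/* Look up an entry; a hit locks it down until the next command boundary. */
const PinEntry *pin_cache_lookup(unsigned key)
{
    PinEntry *e = &pin_cache[key % PIN_SLOTS];

    if (!e->valid || e->key != key)
        return NULL;
    e->referenced = true;
    return e;
}

/*
 * Process an invalidation for one key.  An unreferenced entry is dropped.
 * A referenced entry is left in place and an error is reported instead,
 * so pointers already handed out never dangle.
 */
bool pin_cache_invalidate(unsigned key)
{
    PinEntry *e = &pin_cache[key % PIN_SLOTS];

    if (!e->valid || e->key != key)
        return true;                    /* nothing cached, nothing to do */
    if (e->referenced) {
        fprintf(stderr, "ERROR: cache entry %u changed while in use\n", key);
        return false;
    }
    e->valid = false;
    return true;
}

/* Transaction start or command counter increment: release every lock. */
void pin_command_boundary(void)
{
    for (int i = 0; i < PIN_SLOTS; i++)
        pin_cache[i].referenced = false;
}
```

In this model a pointer returned by `pin_cache_lookup` stays usable until the next `pin_command_boundary` call, which mirrors the single coding rule stated above.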

regards, tom lane
