daveg <daveg(at)sonic(dot)net> writes:
> Here is the update: the problem happens with vacuum full alone, no reindex
> is needed to trigger it. I updated the script to avoid reindexing after
> vacuum. Over the past two days there are still many occurrences of this
> error coincident with the vacuum.
Well, that jibes with the assumption that the one case we saw in
the buildfarm was the same thing, because the regression tests were
certainly only doing a VACUUM FULL and not a REINDEX of pg_class.
But it doesn't get us much closer to understanding what's happening.
In particular, it seems to knock out most ideas associated with race
conditions, because the VAC FULL should hold exclusive lock on pg_class
until it's completely done (including index rebuilds).

I think we need to start adding some instrumentation so we can get a
better handle on what's going on in your database. If I were to send
you a source-code patch for the server that adds some more logging
printout when this happens, would you be willing/able to run a patched
build on your machine?

(BTW, just to be perfectly clear ... the "could not find pg_class tuple"
errors always mention index 2662, right, never any other number?)
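
One way to answer that question from the server log is to tally the index OIDs named in the error lines. The following is only a minimal sketch: the message text is taken from the subject of this thread, while the surrounding log line format, the function name count_index_oids, and the sample lines are assumptions made for illustration.

```python
import re
from collections import Counter

# Message text as it appears in the thread subject; everything else
# on the log line (timestamp, severity prefix) is an assumed format.
PATTERN = re.compile(r"could not find pg_class tuple for index (\d+)")


def count_index_oids(lines):
    """Count how often each index OID appears in matching error lines."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical sample lines, not taken from any real log.
sample = [
    "2011-07-30 03:10:01 ERROR:  could not find pg_class tuple for index 2662",
    "2011-07-30 03:10:02 LOG:  checkpoint starting: time",
    "2011-07-31 03:10:05 ERROR:  could not find pg_class tuple for index 2662",
]

result = count_index_oids(sample)
for oid, n in sorted(result.items()):
    print(f"index {oid}: {n} occurrence(s)")
```

If the resulting tally has a single key, every error in the scanned lines names the same index; any additional key would mean other indexes are affected as well. To scan a real log, pass an open file object in place of the sample list.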
regards, tom lane