Re: error: could not find pg_class tuple for index 2662

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: daveg <daveg(at)sonic(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: error: could not find pg_class tuple for index 2662
Date: 2011-08-04 20:16:08
Message-ID: 14320.1312488968@sss.pgh.pa.us
Lists: pgsql-hackers

daveg <daveg(at)sonic(dot)net> writes:
> We are seeing "cannot read" and "cannot open" errors too that would be
> consistent with trying to use a vanished file.

Yeah, these all seem consistent with the idea that the failing backend
somehow missed an update for the relation mapping file. You would get
the "could not find pg_class tuple" syndrome if the process was holding
an open file descriptor for the now-deleted file, and "cannot open" or
"cannot read" type errors otherwise. And unless it later received another
sinval message for the relation mapping file, the errors would persist.
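
The open-descriptor case can be illustrated outside PostgreSQL with a minimal sketch. It assumes ordinary POSIX unlink semantics on a local filesystem. The helper name read_via_stale_fd and the file names are hypothetical, and none of this is backend code. It shows only that a process holding a descriptor keeps reading the old, unlinked file even after a replacement exists at the same path.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): create `path` containing
 * `old_data`, keep the descriptor open, then unlink the file and recreate
 * it with `new_data`, as a stand-in for another process swapping in a new
 * physical file. Finally read through the stale descriptor into `buf`.
 * Returns the number of bytes read, or -1 on any error.
 */
ssize_t read_via_stale_fd(const char *path, const char *old_data,
                          const char *new_data, char *buf, size_t buflen)
{
    ssize_t n = -1;
    int stale_fd, new_fd;

    if (buflen == 0)
        return -1;

    stale_fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
    if (stale_fd < 0)
        return -1;
    if (write(stale_fd, old_data, strlen(old_data)) < 0)
        goto done;

    /* The "other process": remove the old file and create a replacement. */
    if (unlink(path) != 0)
        goto done;
    new_fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (new_fd < 0)
        goto done;
    if (write(new_fd, new_data, strlen(new_data)) < 0) {
        close(new_fd);
        goto done;
    }
    close(new_fd);

    /* The stale descriptor still refers to the unlinked inode. */
    if (lseek(stale_fd, 0, SEEK_SET) < 0)
        goto done;
    n = read(stale_fd, buf, buflen - 1);
    if (n >= 0)
        buf[n] = '\0';

done:
    close(stale_fd);
    return n;
}

#ifdef DEMO_MAIN
/* Build with -DDEMO_MAIN to run the sketch standalone. */
int main(void)
{
    char buf[64];
    ssize_t n = read_via_stale_fd("stale_fd_demo.tmp", "old contents",
                                  "new contents", buf, sizeof buf);

    if (n < 0) {
        perror("read_via_stale_fd");
        return 1;
    }
    printf("stale descriptor read: %s\n", buf);
    unlink("stale_fd_demo.tmp");
    return 0;
}
#endif
```

A process without an open descriptor would instead have to open the old path by name, and once that file is gone the open fails. That corresponds to the "cannot open" flavor of the errors.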

If this theory is correct then all of the file-related errors ought to
match up to recently-vacuumed mapped catalogs or indexes (those are the
ones with relfilenode = 0 in pg_class). Do you want to expand your
logging of the VACUUM FULL actions and see if you can confirm that idea?

Since the machine is running RHEL, I think we can use glibc's
backtrace() function to get simple stack traces without too much effort.
I'll write and test a patch and send it along in a bit.
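
For reference, a minimal sketch of what glibc's backtrace() facility looks like in isolation. This is not the patch mentioned above. The function name log_backtrace and the frame limit are assumptions made for illustration. Only backtrace() and backtrace_symbols_fd() from <execinfo.h> are real glibc calls.

```c
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_FRAMES 32   /* arbitrary cap chosen for this sketch */

/*
 * Hypothetical helper: capture the current call stack and write one line
 * per frame to stderr. Returns the number of frames captured.
 */
int log_backtrace(void)
{
    void *frames[MAX_FRAMES];
    int nframes = backtrace(frames, MAX_FRAMES);

    /*
     * backtrace_symbols_fd() writes straight to the descriptor and does
     * not allocate the result with malloc(), unlike backtrace_symbols().
     */
    backtrace_symbols_fd(frames, nframes, STDERR_FILENO);
    return nframes;
}

#ifdef DEMO_MAIN
/* Build with -DDEMO_MAIN to run the sketch standalone. */
int main(void)
{
    int n = log_backtrace();

    fprintf(stderr, "captured %d frame(s)\n", n);
    return 0;
}
#endif
```

Function names appear in the output only for symbols the dynamic linker can see, so linking with -rdynamic usually makes the traces more readable. Otherwise the frames show up as raw addresses.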

regards, tom lane
