Re: "could not open relation with OID" errors after promoting the standby to master

From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: "could not open relation with OID" errors after promoting the standby to master
Date: 2012-05-17 12:24:37
Message-ID: CACw0+12+KMFtHxLujLggB9hH_FfVHt=dm2V4GM4ZmDcnVSQV6g@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 16, 2012 at 11:38 PM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> Well, that is not surprising in itself -- InitTempTableNamespace calls
> RemoveTempRelations to cleanup from a possibly crashed previous backend
> with the same ID.  So that part of the backtrace looks normal to me
> (unless there is something weird going on, which might very well be the
> case).

Right, I guess the stack trace is okay but some state was obviously wrong.

I was able to clean that up now by some catalog hacking, but I'm
definitely going to dump and reload soon.

I found out that only certain backend ids couldn't create temporary
tables: when I did a "create temp table" in one of them (about 4-5,
all with low id numbers, which is why I hit them quite often), it
would give me this "could not open relation with OID x" error.

I also couldn't drop the temp schema in these backends:

# drop schema pg_temp_4;
ERROR: cache lookup failed for relation 1990987636

# select oid, * from pg_namespace ;
(got oid 4664506 for "pg_temp_4")

# select * from pg_class where oid = 1990987636;
(no rows returned)

# delete from pg_namespace where oid = 4664506;
DELETE 1

# create temp table myfoo(a int);
CREATE TABLE

Later on I also found some leftover pg_type entries from temporary
tables that didn't exist anymore. I'm quite certain that I shouldn't
see these anymore... And I also found a few entries in pg_class with
relistemp='t' whose oid is considerably older than anything recent.
This kinda suggests that there might be something weird going on when
you have temp tables in flight and fail over, at least that's the only
explanation I have for how this could have happened.
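
The leftovers described above could be looked for with catalog queries
along these lines. This is only a sketch, assuming a 9.0-era catalog
where pg_class still has the relistemp column (9.1 and later use
relpersistence = 't' instead). The queries only read the catalogs and
are illustrative, not the ones actually run in this session.

```sql
-- Temporary relations still recorded in pg_class, oldest oid first.
-- LEFT JOIN so that rows whose namespace entry is already gone
-- still show up (with a NULL nspname).
SELECT c.oid, c.relname, c.relnamespace, n.nspname
FROM pg_class c
LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relistemp
ORDER BY c.oid;

-- Composite (row) types whose backing relation no longer exists
-- in pg_class, i.e. candidates for leftover pg_type entries.
SELECT t.oid, t.typname, t.typnamespace, t.typrelid
FROM pg_type t
LEFT JOIN pg_class c ON c.oid = t.typrelid
WHERE t.typrelid <> 0
  AND c.oid IS NULL;
```

Rows from the first query that belong to a pg_temp_N schema with no
live backend using that id, or any rows at all from the second, would
be candidates for the kind of stale state described here.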
