Re: [GENERAL] cache lookup of relation 165058647 failed

From: Sean Chittenden <sean(at)chittenden(dot)org>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: PostgreSQL Bugs List <pgsql-bugs(at)postgresql(dot)org>,Juris Krumins <juriskr(at)komin(dot)lv>
Subject: Re: [GENERAL] cache lookup of relation 165058647 failed
Date: 2004-05-05 20:40:54
Message-ID: 816FE1CE-9ED4-11D8-B669-000A95C705DC@chittenden.org
Lists: pgsql-bugs, pgsql-general
>>> I've found that this error occurs in:
>>>  dependency.c file
>>>
>>> 2004-04-26 11:09:34 ERROR:  dependency.c 1621: cache lookup of relation 149064743 failed
>>> 2004-04-26 11:09:34 ERROR:  Relation "tmp_table1" does not exist
>>> 2004-04-26 11:09:34 ERROR:  Relation "tmp_table1" does not exist
>>>
>>> in getRelationDescription(StringInfo buffer, Oid relid) function.
>>>
>>> Any ideas what can cause these errors?
>> <aol>Me too.</aol>
>> But I suspect that it's a race condition with the new
>> background writer code.  I've started testing a new database design
>> and was able to reproduce this on my laptop nearly 90% of the time,
>> but only about 10% of the time on my production databases, until I
>> figured out what the difference was: fsync.
>
> temp tables don't use the shared buffer cache, how can this be related 
> to the BG writer?

Don't the system catalogs use the shared buffer cache?

BEGIN;
SELECT create_temp_table_func();  -- Inserts a row into pg_class via CREATE TEMP TABLE
-- Do other stuff
COMMIT;        -- After the commit, the row is now visible to other backends
-- disconnect  -- If the delay between the disconnect and reconnect is small enough,
-- reconnect   -- it's as though there is a race condition that allows the function
               -- pg_table_is_visible() to raise the "cache lookup of relation"
               -- error.
BEGIN;
SELECT create_temp_table_func();  -- Before the CREATE TEMP TABLE, I call
    /* SELECT TRUE FROM pg_catalog.pg_class c
       LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
       WHERE c.relname = ''footmp''::TEXT AND
             c.relkind = ''r''::TEXT AND
             pg_catalog.pg_table_is_visible(c.oid); */
    -- But the query fails

My guess was that the series of events went something like this:

proc 0) COMMITs, and the row in pg_class is committed
proc 1) bgwriter code removes a page from the cache
proc 2) queries for the page  [*]
proc 1) writes it to disk
proc 2) queries for the page  [*]
proc 1) syncs the fd

[*] proc 2 queries for the page at either of these points
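
To make the guessed ordering concrete, here is a toy model in Python. It is only an illustration of the sequence above under the assumption that a page is invisible between leaving the cache and being synced; it is not PostgreSQL code and says nothing about how the real buffer manager behaves. ToyCache, lookup_after, and the page and row names are all made up for the sketch.

```python
# Toy model of the *guessed* interleaving; not PostgreSQL code.
# All names here (ToyCache, lookup_after, "pg_class page") are hypothetical.

class ToyCache:
    def __init__(self):
        self.cached = {}   # pages visible in the shared cache
        self.pending = {}  # removed from the cache, not yet written
        self.written = {}  # written, but the fd is not yet synced
        self.synced = {}   # written and synced

    def commit(self, page, row):
        # proc 0: the committed row lands in a cached page
        self.cached[page] = row

    def evict(self, page):
        # proc 1: background writer removes the page from the cache
        self.pending[page] = self.cached.pop(page)

    def write(self, page):
        # proc 1: writes the page to disk
        self.written[page] = self.pending.pop(page)

    def sync(self, page):
        # proc 1: syncs the fd
        self.synced[page] = self.written.pop(page)

    def lookup(self, page):
        # proc 2: assumed to see only cached or fully synced pages
        if page in self.cached:
            return self.cached[page]
        return self.synced.get(page)


def lookup_after(step):
    """Run the whole schedule; proc 2 looks the page up right after `step`."""
    cache = ToyCache()
    page = "pg_class page"
    result = None
    for name in ("commit", "evict", "write", "sync"):
        if name == "commit":
            cache.commit(page, "footmp")
        else:
            getattr(cache, name)(page)
        if name == step:
            result = cache.lookup(page)
    return result


if __name__ == "__main__":
    for step in ("commit", "evict", "write", "sync"):
        print(step, "->", lookup_after(step))
```

In this model the lookup only fails at the two [*] points, after "evict" and after "write"; before the eviction and after the sync it finds the row.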

In 7.4, there is no bgwriter or other background process mucking with
the cache, which is why this works 100% of the time.  In 7.5, however,
there's a 200ms gap in which a race condition appears and
pg_table_is_visible() fails its PointerIsValid() check.  If I put a
sleep in, it gives the bgwriter enough time to commit the pages to disk,
so that the queries for the page happen after the fd has been sync()'ed.

I have no other clue as to why this would be happening, though, so
believe me when I say I could very well be quite wrong... but this is
my best, quasi-educated/grep(1)'ed guess.

-sc

-- 
Sean Chittenden

