Re: [GENERAL] cache lookup of relation 165058647 failed

From: Sean Chittenden <sean(at)chittenden(dot)org>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: PostgreSQL Bugs List <pgsql-bugs(at)postgresql(dot)org>, Juris Krumins <juriskr(at)komin(dot)lv>
Subject: Re: [GENERAL] cache lookup of relation 165058647 failed
Date: 2004-05-04 05:19:08
Message-ID: 91B18B5E-9D8A-11D8-8912-000A95C705DC@chittenden.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-general

> I'v find out that this error occurs in:
> dependency.c file
>
> 2004-04-26 11:09:34 ERROR: dependency.c 1621: cache lookup of relation
> 149064743 failed
> 2004-04-26 11:09:34 ERROR: Relation "tmp_table1" does not exist
> 2004-04-26 11:09:34 ERROR: Relation "tmp_table1" does not exist
>
> in getRelationDescription(StringInfo buffer, Oid relid) function.
>
> Any ideas what can cause this errors.

<aol>Me too.</aol>

But, I am suspecting that it's a race condition with the new background
writer code. I've started testing a new database design and was able
to reproduce this on my laptop nearly 90% of the time, but could only
reproduce it about 10% of the time on my production databases until I
figured out what the difference was, fsync.

fsync was causing enough of a slow down that SearchSysCache() was
finding the tuple, whereas with fsync = false, it wasn't able to find
it. But, in search of proving that it wasn't fsync (I use fsync =
false on my laptop to save my pour drive), I threw in a sleep in
between my tests, and I'm able to get things to work 100% of the time
by adding a sleep. The following fails to work with fsync = false, 90%
of the time and with fsync = true, only 10% of the time.

% psql -f test-begin.sql template1 && psql -f test_enterprise_class.sql
&& psql -f test-end1.sql template1 && psql -f test-end2.sql template1

But, if I change the command to:

% psql -f test-begin.sql template1 && psql -f test_enterprise_class.sql
&& psql -f test-end1.sql template1 && sleep 1 && psql -f test-end2.sql
template1

I have no problems with cache relation misses. As for what happens in
those commands, I'm:

-- 1) Dropping the test database and re-creating it
-- 2) In a different connection, load a rather large schema as the dba
-- 3) Connect again and create a temp table
-- 4) Connect a second time, and check to see if the temp table exists

The sleep comes at step 3.5 in the above sequence of operations.

*boom* Here's a snippet of my terminal (the first thing I do after
BEGINning a transaction is create a temp table if it doesn't exist):

## BEGIN ##
[snip]
[...]
COMMIT
You are now connected to database "test" as user "usr".
BEGIN
psql:test-end2.sql:3: ERROR: cache lookup failed for relation 398033
CONTEXT: SQL query "SELECT TRUE FROM pg_catalog.pg_class c LEFT JOIN
pg_catalog.pg_namespace n ON n.oid = c.relnamespace WHERE c.relname =
'tmptbl'::TEXT AND c.relkind = 'r'::TEXT AND
pg_catalog.pg_table_is_visible(c.oid)"
PL/pgSQL function "create_tmptbl" line 2 at perform
PL/pgSQL function "check_or_populate_func" line 8 at assignment
PL/pgSQL function "setuid_wrapper_func" line 5 at return
## END ##

What's really bothering me is I can push the up arrow on the console,
run the exact same thing (including dropping the database), and it'll
work sometimes. Very disturbing. As I said, I'm *very* suspicious of
the background writer goo that Jan added simply because I can't think
of anything else that'd have this problem.

I've run each of those commands 100 times now, with and without the
sleep 1. With the sleep 1, it's worked 100% of the time. Jan, any bit
of code that comes to mind?

All of my bgwriter_* settings are set to their default.

-sc

--
Sean Chittenden

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Stephan Szabo 2004-05-04 05:32:22 Re: Erro
Previous Message Tom Lane 2004-05-04 05:06:40 Re: Erro

Browse pgsql-general by date

  From Date Subject
Next Message Greg Stark 2004-05-04 05:30:20 Re: Tracking structural changes from psql
Previous Message Bruce Momjian 2004-05-04 04:35:42 Re: wait_seconds in pg_ctl