Re: failure with pg_dump

From: Mija Lee <mija(at)scharp(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: failure with pg_dump
Date: 2007-12-11 16:34:05
Message-ID: 475EBBFD.1020005@scharp.org
Lists: pgsql-novice

A number of odd things have been going on that I can't really explain
and that don't seem to produce log entries. Here's some info:

- this is running 8.2.4 on a Solaris 10 machine
- I reran the dump after posting and these problems did not recur
- We have a number of replicated schemas and tables on this server.
There were other problems with the replication that happened earlier in
the evening.
- we have been having some very odd problems where our replication
scripts hang intermittently. For the life of me I can't figure out why,
but when this happens, I look for processes that are idle in transaction
and more than one day old and kill them. That seems to allow the
replication to finish. I have a few users who use a variety of
products to view and manipulate the data in these tables (Tableau,
Access, Excel, EMS, phpPgAdmin, DbVisualizer), and it seems like some
connections/transactions never terminate, but I can't figure out which
ones or why. I've been struggling with this problem for some time, but
the stalled replication has never affected the dump before.
I was actually hoping that this error would help shed light on the
replication problem.
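
For reference, the check for old idle-in-transaction processes can be
sketched roughly as the query below. This is only an illustration, not
the exact command used here. It assumes the 8.2-era pg_stat_activity
column names (procpid, current_query), that command-string statistics
are enabled, and that it is run as a superuser so other users' queries
are visible. "More than one day old" is read as the age of the backend
process; filtering on query_start instead is another possible reading.

```sql
-- Sketch only: list backends that are idle in transaction and whose
-- process started more than one day ago (PostgreSQL 8.2 column names).
SELECT procpid,        -- OS process ID of the backend
       usename,        -- user that opened the connection
       datname,        -- database it is connected to
       client_addr,    -- where the connection came from
       backend_start,  -- when the backend process started
       query_start     -- when its most recent activity began
FROM pg_catalog.pg_stat_activity
WHERE current_query = '<IDLE> in transaction'
  AND backend_start < now() - interval '1 day'
ORDER BY backend_start;
```

The usename and client_addr columns may help narrow down which client
tool is leaving transactions open.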

Mija

Tom Lane wrote:
> Mija Lee <mija(at)scharp(dot)org> writes:
>> I have a script that I use to do regular dumps of my database. Over the
>> weekend it failed, and produced the following error message. I'm not
>> sure why this would have happened, how I would find out which index is
>> referenced by 136451098, or where this select came from.
>
> It sounds like system catalog corruption, which is not good :-(.
>
>> pg_dump.sqlhost: Error message from server: ERROR: cache lookup failed
>> for index 136451098
>> pg_dump.sqlhost: The command was: SELECT t.tableoid, t.oid, t.relname as
>> indexname, pg_catalog.pg_get_indexdef(i.indexrelid) as indexdef,
>> t.relnatts as indnkeys, i.indkey, i.indisclustered, c.contype,
>> c.conname, c.tableoid as contableoid, c.oid as conoid, (SELECT spcname
>> FROM pg_catalog.pg_tablespace s WHERE s.oid = t.reltablespace) as
>> tablespace, array_to_string(t.reloptions, ', ') as options FROM
>> pg_catalog.pg_index i JOIN pg_catalog.pg_class t ON (t.oid =
>> i.indexrelid) LEFT JOIN pg_catalog.pg_depend d ON (d.classid =
>> t.tableoid AND d.objid = t.oid AND d.deptype = 'i') LEFT JOIN
>> pg_catalog.pg_constraint c ON (d.refclassid = c.tableoid AND d.refobjid
>> = c.oid) WHERE i.indrelid = '136451090'::pg_catalog.oid ORDER BY indexname
>
> That looks like pg_dump's query to get information about the indexes of
> a particular table. So apparently the problem index is one of the ones
> for the table with OID 136451090. The easiest way to find out which one
> that is is
> select '136451090'::regclass;
> Trying \d on each of that table's indexes in succession would tell you
> which one is trashed.
>
> As for fixing it, the $64 question is how extensive is the catalog
> corruption. I see no very good reason to hope that only this one index
> is affected :-(. What you probably want to do is try to get a clean
> pg_dump then initdb and reload --- at least that's how I'd approach it,
> rather than hoping that there's no lurking problems remaining after you
> hack your way around the one you can see.
>
> What I'd try first is a REINDEX on pg_class. If that doesn't help,
> try to delete the pg_index row linking 136451098 and 136451090.
>
> What PG version is this, anyway, and did anything weird happen on your
> system that might explain data corruption?
>
> regards, tom lane
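
The lookup steps suggested in the quoted reply can be sketched as the
queries below. The first is the command given above; the second is an
assumption added for illustration (it is not from the original reply)
and simply lists the index entries recorded for that table, so each can
be tried with \d in turn. A NULL index_name would mean the pg_index row
has no matching pg_class row, though the damage may not show up this way.

```sql
-- Which table has OID 136451090? (command from the quoted reply)
SELECT '136451090'::regclass;

-- Sketch only: list the index entries recorded for that table.
-- The LEFT JOIN keeps pg_index rows even if pg_class has no match.
SELECT i.indexrelid,
       c.relname AS index_name
FROM pg_catalog.pg_index i
LEFT JOIN pg_catalog.pg_class c ON c.oid = i.indexrelid
WHERE i.indrelid = '136451090'::pg_catalog.oid
ORDER BY i.indexrelid;
```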
