Re: Cache lookup failure for index during pg_dump

From: Bob Lunney <bob_lunney(at)yahoo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Cache lookup failure for index during pg_dump
Date: 2010-02-20 20:10:39
Message-ID: 724775.65373.qm@web39704.mail.mud.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Tom,

Thanks for the detailed explanation - I appreciate it. If i understand correctly I should be able to just change the job to not drop and recreate the indexes and the dump will run without the cache lookup error. I'll take a hit in performance reloading the table, but I like database dumps more than speed at this point.

Could pg_dump also take an AccessShared lock on the system tables to prevent DDL changes during the dump, thereby preventing this error? Just a thought...

Bob Lunney

--- On Fri, 2/19/10, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> Subject: Re: [BUGS] Cache lookup failure for index during pg_dump
> To: "Bob Lunney" <bob_lunney(at)yahoo(dot)com>
> Cc: pgsql-bugs(at)postgresql(dot)org
> Date: Friday, February 19, 2010, 7:39 PM
> Bob Lunney <bob_lunney(at)yahoo(dot)com>
> writes:
> > I have a job that summarizes some data from a table,
> truncates the table, drops the indexes, writes the
> summarized data back into the table, then recreates the
> indexes.  The operations above are not in a single
> transaction, but separate statements executed by a
> script.   Easy, runs great, has for
> years. 
> > Recently the job takes a little longer to run and is
> still going when the database dump starts. That's when I
> started getting this:
>
> > ERROR:  cache lookup failed for index 70424
>
> > My questions are: can making DDL changes during a dump
> cause this error?  Are the queries used by pg_dump
> transactionally consistent, i.e. do they run in a
> transaction and get a single view of the database system
> catalogs?  Other than finer coordination of jobs, how
> can this situation be avoided?
>
> It's a bit messy.  pg_dump runs in a serializable
> transaction, so it
> sees a consistent snapshot of the database including system
> catalogs.
> However, it relies in part on various specialized backend
> functions like
> pg_get_indexdef(), and those things tend to run on
> SnapshotNow time, ie
> they look at the currently committed state.  So it is
> possible to get
> this type of error if someone performs DDL changes while a
> dump is
> happening: pg_dump sees index 70424 still listed in the
> catalogs,
> so it asks about it, and the backend says "there is no such
> index",
> which there isn't anymore because somebody dropped it since
> pg_dump's
> transaction started.
>
> The window for this sort of thing isn't very large, because
> the first
> thing pg_dump does is acquire AccessShareLock on every
> table it intends
> to dump, and past that point it won't be possible for
> anyone to modify
> the table's DDL.  But it can happen.
>
> The right fix for this is to make all those inquiry
> functions use the
> calling query's snapshot; but duplicating a lot of backend
> infrastructure is going to be a major pain in the rear, so
> the
> discussion has kind of petered out every time it's come up
> in the past.
>
> In practice, because pg_dump does lock out DDL changes for
> the bulk of
> its run, it's not a great idea to be scheduling
> DDL-changing jobs during
> your dumps anyhow.  Most of the time they'll just get
> blocked till the
> dump finishes, and if they touch more than one table it's
> not at all
> unlikely for them to end up deadlocked against pg_dump's
> locks.  A fix
> for the snapshot-timing problem wouldn't do a thing for
> that problem.
>
> So in short, the path of least resistance is to reschedule
> your dumps.
> Or reconsider whether you really need to drop and recreate
> those indexes
> --- could you use REINDEX instead?
>
>            
> regards, tom lane
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2010-02-20 20:47:00 Re: Cache lookup failure for index during pg_dump
Previous Message Robert Haas 2010-02-20 04:13:43 Re: BUG #5015: MySQL migration wizard does not start