pg_internal.init is hazardous to your health

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: pg_internal.init is hazardous to your health
Date: 2006-10-18 02:29:13
Message-ID: 14353.1161138553@sss.pgh.pa.us
Lists: pgsql-hackers

Dirk Lutzebaeck and I just spent a tense couple of hours trying to
figure out why a large database Down Under wasn't coming up after being
reloaded from a base backup plus PITR recovery. The symptoms were that
the recovery went fine, but backend processes would fail at startup or
soon after with "could not open relation XX/XX/XX: No such file" type of
errors.

The answer that ultimately emerged was that they'd been running a
nightly maintenance script that did REINDEX SYSTEM (among other things,
I suppose). The PITR base backup included pg_internal.init files that
were appropriate when it was taken, and the PITR recovery process did
nothing whatsoever to update 'em :-(. So incoming backends picked up
init files with obsolete relfilenode values.

We don't actually need to *update* the file, per se, we only need to
remove it if no longer valid --- the next incoming backend will rebuild
it. I could see fixing this by making WAL recovery run around and zap
all the .init files (only problem is to find 'em), or we could add a new
kind of WAL record saying "remove the .init file for database XYZ"
to be emitted whenever someone removes the active one. Thoughts?

Meanwhile, if you're trying to recover from a PITR backup and it's not
working, try removing any pg_internal.init files you can find.
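
For anyone who wants to script that workaround, here is a minimal sketch
in Python. It is not part of PostgreSQL; the function names are made up
for illustration. It assumes you point it at the cluster's data
directory, that the server is stopped while it runs, and that any
tablespaces are reachable through symlinks under that directory (hence
followlinks=True). By default it only lists what it finds.

```python
import os

INIT_FILE_NAME = "pg_internal.init"


def find_init_files(data_dir):
    """Return a sorted list of every pg_internal.init path under data_dir."""
    found = []
    # followlinks=True so that symlinked directories are searched as well
    # (an assumption about how tablespaces are laid out on disk).
    for dirpath, _dirnames, filenames in os.walk(data_dir, followlinks=True):
        if INIT_FILE_NAME in filenames:
            found.append(os.path.join(dirpath, INIT_FILE_NAME))
    return sorted(found)


def remove_init_files(data_dir, dry_run=True):
    """List the init files under data_dir; delete them unless dry_run is True.

    Returns the list of paths that were (or would be) removed.
    """
    found = find_init_files(data_dir)
    if not dry_run:
        for path in found:
            os.remove(path)
    return found
```

Call remove_init_files(data_dir) first to review the list, then
remove_init_files(data_dir, dry_run=False) to delete. The next backend
to start in each database should then rebuild its init file.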

regards, tom lane
