Re: pg_internal.init is hazardous to your health

From: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_internal.init is hazardous to your health
Date: 2006-10-18 02:49:23
Message-ID: Pine.LNX.4.58.0610181247110.13682@linuxworld.com.au
Lists: pgsql-hackers

On Tue, 17 Oct 2006, Tom Lane wrote:

> Dirk Lutzebaeck and I just spent a tense couple of hours trying to
> figure out why a large database Down Under wasn't coming up after being
> reloaded from a base backup plus PITR recovery. The symptoms were that
> the recovery went fine, but backend processes would fail at startup or
> soon after with "could not open relation XX/XX/XX: No such file" type of
> errors.
>
> The answer that ultimately emerged was that they'd been running a
> nightly maintenance script that did REINDEX SYSTEM (among other things
> I suppose). The PITR base backup included pg_internal.init files that
> were appropriate when it was taken, and the PITR recovery process did
> nothing whatsoever to update 'em :-(. So incoming backends picked up
> init files with obsolete relfilenode values.

Ouch.

> We don't actually need to *update* the file, per se, we only need to
> remove it if no longer valid --- the next incoming backend will rebuild
> it. I could see fixing this by making WAL recovery run around and zap
> all the .init files (only problem is to find 'em), or we could add a new
> kind of WAL record saying "remove the .init file for database XYZ"
> to be emitted whenever someone removes the active one. Thoughts?

The latter seems the Right Way except, I guess, that the decision to
remove the file is buried deep inside inval.c.
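
For comparison, the first option (having recovery sweep the data directory and zap every init file) might look roughly like the sketch below. It is only an illustration in Python, not PostgreSQL code: the function name remove_init_files and the assumption that the files sit somewhere under the data directory (for example base/<database oid>/pg_internal.init) are mine, and a real fix would live in the C recovery code.

```python
import os

# File name taken from the thread; everything else here is illustrative.
INIT_FILE_NAME = "pg_internal.init"


def remove_init_files(data_dir):
    """Delete every pg_internal.init found under data_dir.

    Returns the sorted list of paths that were removed. Nothing is
    rebuilt here: per the thread, the next incoming backend recreates
    the file when it finds none.
    """
    removed = []
    for dirpath, _dirnames, filenames in os.walk(data_dir):
        if INIT_FILE_NAME in filenames:
            path = os.path.join(dirpath, INIT_FILE_NAME)
            os.remove(path)
            removed.append(path)
    return sorted(removed)
```

Walking the whole tree sidesteps the "only problem is to find 'em" issue at the cost of touching every directory, which is part of why a targeted WAL record looks cleaner.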

Thanks,

Gavin
