Re: random system table corruption ...

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, eg(at)cybertec(dot)at
Subject: Re: random system table corruption ...
Date: 2005-09-11 11:57:35
Message-ID: 20050911115730.GD10980@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Sep 11, 2005 at 01:12:34PM +0200, Hans-Jürgen Schönig wrote:
> in the past we have faced a couple of problems with corrupted system
> tables. this seems to be a version independent problem which occurs on
> hackers' from time to time.
> i have checked a broken file and i have seen that the corrupted page has
> actually been zeroed out.

Near as I can tell, the only times pages are zeroed out is if
zero_damaged_pages is set (destroying the evidence) or during WAL
recovery.

> my question is: are there any options to implement something which makes
> system tables more robust? the problem is: the described error happens
> only once i an while and cannot be reproduced. maybe there is a way to
> add some more sanity checks before the page is actually written.

Well, the most common causes are dodgy memory. Other than that I guess
you could arrange for bgwriter to check the pages it is writing. I
imagine it already does check the header, checking the data requires
knowledge about the actual table and attributes. And about the only
thing that says "I'm broken" is a varlena value with a long value.

As they say, the only thing sure would be to have a backup. the only
thing I can imagine being really useful would be a restore mode where
you feed it the schema so it can reconstruct the pg_class and
pg_attribute just enough for you to dump it to reconstruct
everything...

You know, VACUUM FREEZE BACKUP on pg_catalog, physically copy the
datafiles and offer the option to blat your catalog with an old one...
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Martin Ehmsen 2005-09-11 13:47:45 Benchmarks?
Previous Message Hans-Jürgen Schönig 2005-09-11 11:12:34 random system table corruption ...