Re: Perl script failure => Postgres 7.1.2 database corruption

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Frank McKenney <frank_mckenney(at)mindspring(dot)com>
Cc: "PostgreSQL Bug List" <pgsql-bugs(at)postgresql(dot)org>, Ethan Burnside <support(at)kattare(dot)com>
Subject: Re: Perl script failure => Postgres 7.1.2 database corruption
Date: 2001-11-09 17:36:38
Message-ID: 20769.1005327398@sss.pgh.pa.us
Lists: pgsql-bugs

Frank McKenney <frank_mckenney(at)mindspring(dot)com> writes:
> 1) Are there circumstances under which a "space exceeded" error on
> a client machine _can_ damage a database on the Postgres server
> machine?

I don't see how. The error messages you cite all point to the idea
that there was some internal corruption in the database. I'd venture
there was more than one corrupted block, in fact. It appears that block 1444
of your summary table's toast relation was clobbered (probably zeroed
out in whole or in part), and the "relation nnnnn does not exist"
complaints look like some bad things had happened to one or more system
tables as well.

Unfortunately, since you deleted the database, all the evidence is
gone and there's no longer much hope of learning any more. If something
like this happens again, it might be worth tar'ing up the $PGDATA
tree (while the postmaster is stopped) for possible forensic analysis.
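
A minimal sketch of that snapshot step, in Python rather than a literal
tar command. The helper name archive_pgdata and the postmaster.pid check
are assumptions made for illustration: the lock file is used here only as
a rough hint that the postmaster is still running, and the archive should
be written somewhere outside the data directory.

```python
import os
import tarfile


def archive_pgdata(pgdata, dest_tar):
    """Write a gzip-compressed tarball of a stopped cluster's data directory.

    Hypothetical helper: refuses to run if a postmaster.pid file is present,
    on the assumption that its presence means the postmaster may be running.
    """
    pid_file = os.path.join(pgdata, "postmaster.pid")
    if os.path.exists(pid_file):
        raise RuntimeError("postmaster.pid present; stop the postmaster first")

    # Store paths relative to the data directory's own name so the archive
    # unpacks into a single top-level directory.
    top_level = os.path.basename(os.path.normpath(pgdata))
    with tarfile.open(dest_tar, "w:gz") as tar:
        tar.add(pgdata, arcname=top_level)
    return dest_tar
```

For example, archive_pgdata(os.environ["PGDATA"], "/tmp/pgdata-snapshot.tar.gz")
would keep a copy of the damaged tree before the database is dropped.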

> 3) What other things could we have tried to recover this situation?

I think you were pretty much out of luck on that database, though
perhaps partial data recovery could have been made if you were willing
to spend time on it. A more interesting thing to worry about is how to
ensure it doesn't happen again, and here my advice would be to look at
the reliability of your disk drives and I/O hardware. I've seen more
than one report of mysterious data clobbers that were eventually traced to
bogus disk controllers, flaky RAM, etc. In particular, I recall
several data-block-suddenly-became-zero failures with hardware origins,
and none that traced to software problems...
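
As a rough illustration of what a zeroed data block looks like on disk,
the sketch below scans a file in fixed-size blocks and reports which ones
contain only zero bytes. The function name find_zeroed_blocks is
hypothetical, and the 8192-byte block size is an assumption (the default
build setting). An all-zero block is not proof of corruption on its own,
so treat the result as a hint about where to look, not as a diagnosis.

```python
BLOCK_SIZE = 8192  # assumed default block size; differs on custom builds


def find_zeroed_blocks(path, block_size=BLOCK_SIZE):
    """Return the 0-based numbers of blocks in `path` that are entirely zero."""
    zero_block = bytes(block_size)
    zeroed = []
    with open(path, "rb") as f:
        block_no = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            # A short trailing block is compared against zeros of its own length.
            if block == zero_block[:len(block)]:
                zeroed.append(block_no)
            block_no += 1
    return zeroed
```

Run against a copy of a relation file taken while the postmaster is
stopped, it returns a list of block numbers to inspect more closely.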

regards, tom lane
