Re: beta testing version

From: ncm(at)zembu(dot)com (Nathan Myers)
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: beta testing version
Date: 2000-12-01 01:15:29
Message-ID: 20001130171529.P22345@store.zembu.com
Lists: pgsql-hackers

On Thu, Nov 30, 2000 at 05:37:58PM -0800, Mitch Vincent wrote:
> > > No, WAL does help, cause you can then pull in your last dump and recover
> > > up to the moment that power cable was pulled out of the wall ...
> >
> > False, on so many counts I can't list them all.
>
> Why? If we're not talking hardware damage and you have a dump made
> sometime previous to the crash, why wouldn't that work to restore the
> database? I've had to restore a corrupted database from a dump before,
> there wasn't any hardware damage, the database (more specifically the
> indexes) were corrupted. Of course WAL wasn't around but I don't see
> why this wouldn't work...

I posted a more detailed explanation a few minutes ago, but
it appears to have been eaten by the mailing list server.

I won't re-post the explanations that you all have seen over the
last two days, about disk behavior during a power outage; they're
in the archives (I assume -- when last I checked, web access to them
didn't work). Suffice it to say that if you pull the plug, there is
just too much about the state of the disks that is unknown.

As for replaying logs against a restored snapshot dump... AIUI, a
dump records tuples by OID, but the WAL refers to TIDs. Therefore,
the WAL won't work as a re-do log to recover your transactions
because the TIDs of the restored tables are all different.
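
A toy illustration of the mismatch described above, as a minimal sketch
only. The table layout, the log record shape, and the names load_table,
replay, "oid-N" and "tid" are all hypothetical stand-ins and are not
PostgreSQL's actual on-disk or WAL formats. It assumes only that a restore
may place rows at different physical positions than they had originally.

```python
# Toy model: hypothetical structures, not PostgreSQL's real formats.

def load_table(rows):
    """Place rows into slots in the order given; the slot index stands in for a TID."""
    return {tid: row for tid, row in enumerate(rows)}

def replay(table, log):
    """Apply physical redo records: each one overwrites whatever sits at its TID."""
    table = dict(table)  # work on a copy so the input stays untouched
    for record in log:
        table[record["tid"]] = record["new_row"]
    return table

# Original table. Each row is (logical id standing in for an OID, value).
original = load_table([("oid-1", "alice"), ("oid-2", "bob"), ("oid-3", "carol")])

# The log names a physical location (TID 1), not a logical row.
# It records the update of "bob" to "robert".
wal = [{"tid": 1, "new_row": ("oid-2", "robert")}]

# A restore keeps the logical content but (by assumption) not the placement.
restored = load_table([("oid-3", "carol"), ("oid-1", "alice"), ("oid-2", "bob")])

# Replayed against the original layout, the update lands on the right row.
good = replay(original, wal)

# Replayed against the restored layout, TID 1 now holds "alice", so the
# record overwrites the wrong row and the old "bob" row survives elsewhere.
bad = replay(restored, wal)

print(sorted(good.values()))
print(sorted(bad.values()))
```

In this sketch the replay against the restored layout loses the "alice" row
and leaves two rows carrying the same logical id, which is the kind of
damage a location-based log causes once locations have changed.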

To replay logs against a restored dump we need an "update log",
something that might be in 7.2 if somebody does a lot of work.

> Note I'm not saying you're wrong, just asking that you explain your
> comment a little more. If WAL can't be used to help recover from
> crashes where database corruption occurs, what good is it?

The WAL is a performance optimization for the current recovery
capabilities, which assume uncorrupted table files. It protects
against those database server crashes that happen not to corrupt
the table files (i.e., most of them). It doesn't protect against corruption
of the tables, whether by bugs in PG, bugs in the OS, or "hardware events".
It also doesn't protect against OS crashes that result in
write-buffered sectors not having been written before the crash.
Practically, this means that WAL file entries older than a few
seconds are not useful for much.
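
To make the write-buffering point concrete, here is a minimal sketch of
what an application does to ask the OS to push buffered data to the
device. The function name durable_append and the file name are
hypothetical. Note the limit: os.fsync only requests the flush from the
OS, and this sketch makes no claim about what the disk itself does
with its own cache when the power goes away.

```python
import os
import tempfile

def durable_append(path, payload):
    """Append payload to path, then ask the OS to flush it to the device."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview(payload)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Without this call the data may sit in OS buffers for a while,
        # and an OS crash in that window loses it.
        os.fsync(fd)
    finally:
        os.close(fd)

# Usage: append two records to a scratch log file and read them back.
log_path = os.path.join(tempfile.mkdtemp(), "example.log")
durable_append(log_path, b"record 1\n")
durable_append(log_path, b"record 2\n")
with open(log_path, "rb") as f:
    contents = f.read()
```

A write that returns without a matching flush is exactly the case where
the log on disk can lag behind what the server believes it has written.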

In general, it's foolish to expect a single system to store very
valuable data with much confidence. To get full recoverability,
you need a "hot failover" system duplicating your transactions in
real time. (Even then, you're vulnerable to application-level
mistakes.)

Nathan Myers
ncm(at)zembu(dot)com
