On Fri, Oct 3, 2008 at 3:36 PM, Brian Hurt <bhurt(at)janestcapital(dot)com> wrote:
> OK, I have a stupid question- torn pages are a problem, but only during
> recovery. Recovery is (I assume) a fairly rare condition- if data
> corruption is going to happen, it's most likely to happen during normal
> operation. So why not just turn off CRC checksumming during recovery, or at
> least treat it as a much less critical error? During recovery, if the CRC
> checksum matches, we can assume the page is good- not only not corrupt, but
> not torn either. If the CRC checksum doesn't match, we don't panic, but
> maybe we do more careful analysis of the page to make sure that only the
> hint bits are wrong. Or maybe not. It's only during normal operation that
> a CRC checksum failure would be considered critical.
Well:
1. The database half-writes page X to disk, and then there is a power outage.
2. We regain power.
3. During recovery the database replays all WAL-logged pages. Page X
was not WAL-logged, so it is not replayed.
4. When replay is finished, everything looks OK at this point.
5. A user runs a SELECT which hits page X. Oops, we have a checksum
mismatch.
So the torn page survives recovery untouched, and the failure only shows
up later, during normal operation, where it looks just like real corruption.
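
To make the sequence above concrete, here is a rough toy model of it. This is
only a sketch under my own assumptions, not how the server actually lays out
pages: the 8192-byte page, the 512-byte sector, the 4-byte CRC32 at the front
of the page, and the helper names (make_page, torn_write, recover) are all
made up for illustration.

```python
import zlib

PAGE_SIZE = 8192   # assumed page size, for illustration only
SECTOR = 512       # assumed atomic write unit of the disk

def make_page(payload: bytes) -> bytes:
    """Build a toy page: 4-byte CRC32 header, then the zero-padded payload."""
    body = payload.ljust(PAGE_SIZE - 4, b"\x00")
    return zlib.crc32(body).to_bytes(4, "big") + body

def checksum_ok(page: bytes) -> bool:
    """True if the stored CRC32 matches the rest of the page."""
    return int.from_bytes(page[:4], "big") == zlib.crc32(page[4:])

def torn_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Model a power loss after only the first few sectors reached disk."""
    cut = sectors_written * SECTOR
    return new[:cut] + old[cut:]

def recover(disk: dict, wal: dict) -> None:
    """Toy recovery: overwrite only the pages that have an image in the WAL."""
    for page_no, image in wal.items():
        disk[page_no] = image

# Page X gets a small change near its end; page Y is fully rewritten.
old_x = make_page(b"A" * 6000 + b"old")
new_x = make_page(b"A" * 6000 + b"new")
new_y = make_page(b"page Y after the update")

disk = {"X": old_x, "Y": make_page(b"page Y before the update")}
wal = {"Y": new_y}   # premise of the scenario: X was not WAL-logged

# Step 1: power fails halfway through writing X (new header, old tail).
disk["X"] = torn_write(old_x, new_x, sectors_written=8)

# Steps 2-4: power returns, recovery replays the WAL, all looks fine.
recover(disk, wal)

# Step 5: a later read of X trips over the stale tail.
print("Y ok:", checksum_ok(disk["Y"]))
print("X ok:", checksum_ok(disk["X"]))
```

The point of the toy is that recover() never looks at X at all, so relaxing
the checksum test during recovery changes nothing for that page; the mismatch
is first seen by an ordinary read afterwards.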
Best regards,
Dawid Kuroczko
--
.................. ``The essence of real creativity is a certain
: *Dawid Kuroczko* : playfulness, a flitting from idea to idea
: qnex42(at)gmail(dot)com : without getting bogged down by fixated demands.''
`..................' Sherkaner Underhill, A Deepness in the Sky, V. Vinge