* Simon Riggs <simon(at)2ndQuadrant(dot)com> [091130 16:28]:
> You've written that as if you are spotting a problem. It sounds to me
> that this is exactly the situation we would like to detect and this is a
> perfect way of doing that.
> What do you see is the purpose here apart from spotting corruptions?
> Do we think error rates are so low we can recover the corruption by
> doing something clever with the CRC? I envisage most corruptions as
> being unrecoverable except from backup/WAL/replicated servers.
> It's been a long day, so perhaps I've misunderstood.
No, I believe the torn-page problem is exactly the thing that made the
checksum talks stall out last time... A torn page isn't currently a
problem on hint-bit-only writes, because if you get a
half-old/half-new page, the only change is the hint bits - no big loss, the
data is still the same.
But with some form of checksum, when you read the page next time, is it
corrupt? According to the checksum, yes, but in reality the *data* is
still valid; it's just that the stored checksum doesn't match the
half-changed hint bits...
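
To make that concrete, here is a rough Python sketch of the failure mode.
Everything about the layout is made up for illustration (a CRC-32 kept in
the first 4 bytes of an 8 kB page, a 4 kB atomic write unit, a single
hint-bit byte at offset 6000); none of it is the real page format or any
proposed patch. It only shows that a torn hint-bit-only write leaves the
tuple data intact while the stored checksum no longer matches.

```python
import zlib

PAGE_SIZE = 8192       # assumed page size
SECTOR_SIZE = 4096     # assumed unit the storage writes atomically
CHECKSUM_BYTES = 4     # hypothetical: CRC-32 stored at the start of the page
HINT_OFFSET = 6000     # hypothetical hint-bit byte, located in the second sector


def with_checksum(body: bytes) -> bytes:
    """Prefix the page body with a CRC-32 computed over that body."""
    return zlib.crc32(body).to_bytes(CHECKSUM_BYTES, "big") + body


def checksum_ok(page: bytes) -> bool:
    """Recompute the CRC over the body and compare it with the stored value."""
    stored = int.from_bytes(page[:CHECKSUM_BYTES], "big")
    return stored == zlib.crc32(page[CHECKSUM_BYTES:])


def data_without_hints(page: bytes) -> bytes:
    """Return the page body with the hint-bit byte masked out."""
    body = bytearray(page[CHECKSUM_BYTES:])
    body[HINT_OFFSET - CHECKSUM_BYTES] = 0
    return bytes(body)


# Old version of the page: arbitrary "tuple data", hint bit not yet set.
old_body = bytearray((i * 31) % 251 for i in range(PAGE_SIZE - CHECKSUM_BYTES))
old_body[HINT_OFFSET - CHECKSUM_BYTES] = 0x00
old_page = with_checksum(bytes(old_body))

# New version: identical data, only the hint bit flipped, checksum recomputed.
new_body = bytearray(old_body)
new_body[HINT_OFFSET - CHECKSUM_BYTES] = 0x01
new_page = with_checksum(bytes(new_body))

# Torn write: the first sector (new checksum) reached disk, the second
# sector (carrying the new hint bit) did not.
torn_page = new_page[:SECTOR_SIZE] + old_page[SECTOR_SIZE:]

print("old page checksum ok: ", checksum_ok(old_page))
print("new page checksum ok: ", checksum_ok(new_page))
print("torn page checksum ok:", checksum_ok(torn_page))
print("torn page data intact:",
      data_without_hints(torn_page) == data_without_hints(old_page))
```

The torn page fails verification even though nothing but a hint bit
differs from a perfectly good page, which is why a checksum failure on
its own can't be treated as proof of real corruption here.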
And then many not-so-attractive workarounds were thrown around,
with nothing nice falling into place...
Aidan Van Dyk Create like a god,
aidan(at)highrise(dot)ca command like a king,
http://www.highrise.ca/ work like a slave.