Robert Haas wrote:
> On Mon, Nov 30, 2009 at 3:27 PM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> > Simon Riggs wrote:
> >> Proposal
> >> * We reserve enough space on a disk block for a CRC check. When a dirty
> >> block is written to disk we calculate and annotate the CRC value, though
> >> this is *not* WAL logged.
> > Imagine this:
> > 1. A hint bit is set. It is not WAL-logged, but the page is dirtied.
> > 2. The buffer is flushed out of the buffer cache to the OS. A new CRC is
> > calculated and stored on the page.
> > 3. Half of the page is flushed to disk (aka torn page problem). The CRC
> > made it to disk but the flipped hint bit didn't.
> > You now have a page with incorrect CRC on disk.
> This is probably a stupid question, but why doesn't the other half of
> the page make it to disk? Somebody pulls the plug first?
Yep. Disk sectors are 512 bytes, so an 8 kB page spans 16 of them. You
might get only some of those 16 sectors to disk, or a single 512-byte
sector might be partially written. Full-page writes fix these on recovery.
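
To make Heikki's three steps concrete, here is a rough Python sketch of the torn-page scenario. It is only an illustration under stated assumptions: the 4-byte CRC at the start of the page, the hint-bit offset, and the helper names (page_crc, stamp_crc, crc_ok, torn_write) are all hypothetical and not the real page layout or any PostgreSQL API. zlib.crc32 stands in for whatever checksum would actually be used.

```python
import zlib

PAGE_SIZE = 8192    # assumed block size: 16 sectors of 512 bytes
SECTOR_SIZE = 512   # sector size from the discussion above
CRC_BYTES = 4       # hypothetical: CRC kept in the first 4 bytes of the page


def page_crc(page: bytes) -> int:
    """CRC-32 over everything after the (hypothetical) CRC field."""
    return zlib.crc32(page[CRC_BYTES:])


def stamp_crc(page: bytearray) -> None:
    """Store the CRC on the page, as would happen just before a write."""
    page[:CRC_BYTES] = page_crc(page).to_bytes(CRC_BYTES, "little")


def crc_ok(page: bytes) -> bool:
    """Check the stored CRC against one recomputed from the page contents."""
    return int.from_bytes(page[:CRC_BYTES], "little") == page_crc(page)


def torn_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Simulate a crash: only the first `sectors_written` sectors of
    `new` reach disk, the remaining sectors keep the `old` contents."""
    cut = sectors_written * SECTOR_SIZE
    return new[:cut] + old[cut:]


# The page as it sits on disk before anything happens.
on_disk = bytearray(PAGE_SIZE)
stamp_crc(on_disk)

# Step 1: a hint bit is set in the second half of the page (not WAL-logged).
in_memory = bytearray(on_disk)
in_memory[6000] |= 0x01

# Step 2: the buffer is flushed; a new CRC is calculated and stored.
stamp_crc(in_memory)

# Step 3: the plug is pulled after 8 of the 16 sectors are written, so the
# new CRC reaches disk but the flipped hint bit does not.
torn = torn_write(bytes(on_disk), bytes(in_memory), sectors_written=8)

print("old page valid: ", crc_ok(on_disk))
print("new page valid: ", crc_ok(in_memory))
print("torn page valid:", crc_ok(torn))
```

The old and new versions of the page each verify on their own; only the mixture of new CRC and old data fails the check, even though no user data was lost.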
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
+ If your life is a hard drive, Christ can be your backup. +