From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Block-level CRC checks |
Date: | 2009-12-01 12:08:21 |
Message-ID: | 200912011208.nB1C8LD18864@momjian.us |
Lists: | pgsql-hackers |
Robert Haas wrote:
> On Mon, Nov 30, 2009 at 3:27 PM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> > Simon Riggs wrote:
> >> Proposal
> >>
> >> * We reserve enough space on a disk block for a CRC check. When a dirty
> >> block is written to disk we calculate and annotate the CRC value, though
> >> this is *not* WAL logged.
> >
> > Imagine this:
> > 1. A hint bit is set. It is not WAL-logged, but the page is dirtied.
> > 2. The buffer is flushed out of the buffer cache to the OS. A new CRC is
> > calculated and stored on the page.
> > 3. Half of the page is flushed to disk (aka torn page problem). The CRC
> > made it to disk but the flipped hint bit didn't.
> >
> > You now have a page with incorrect CRC on disk.
>
> This is probably a stupid question, but why doesn't the other half of
> the page make it to disk? Somebody pulls the plug first?
Yep, disk sectors are 512 bytes, so you might get only some of the
16 512-byte sectors of an 8kB page to disk, or a single 512-byte sector
might be partially written. Full page writes fix these on recovery.
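
As a rough illustration of the torn-page scenario quoted above, here is a
small Python sketch. It is not PostgreSQL code: the CRC position (first 4
bytes of the page), the use of zlib's CRC-32, and the helper names
(stamp_crc, crc_ok, torn_write) are all assumptions made for the example.
It only shows how a page can end up on disk with a new CRC but an old body
when just some of its 512-byte units are written.

```python
import zlib

PAGE_SIZE = 8192    # 16 x 512 bytes, matching the numbers in the message
SECTOR_SIZE = 512   # on-disk write unit mentioned in the message
CRC_BYTES = 4       # assumption: CRC is kept in the first 4 bytes of the page


def page_crc(page: bytes) -> int:
    """CRC-32 over everything after the reserved CRC field."""
    return zlib.crc32(page[CRC_BYTES:])


def stamp_crc(page: bytearray) -> None:
    """Store the CRC in the reserved field, as would happen at write time."""
    page[:CRC_BYTES] = page_crc(page).to_bytes(CRC_BYTES, "little")


def crc_ok(page: bytes) -> bool:
    """Check the stored CRC against the page contents."""
    return int.from_bytes(page[:CRC_BYTES], "little") == page_crc(page)


def torn_write(on_disk: bytes, in_memory: bytes, sectors_written: int) -> bytes:
    """Simulate a crash after only the first N sectors reached the disk."""
    cut = sectors_written * SECTOR_SIZE
    return bytes(in_memory[:cut]) + bytes(on_disk[cut:])


# The page as it currently sits on disk, with a valid CRC.
old = bytearray(PAGE_SIZE)
stamp_crc(old)

# 1. A "hint bit" is set in memory (byte 5000 lies in sector 9).
# 2. The buffer is flushed, so a new CRC is calculated and stored.
new = bytearray(old)
new[5000] |= 0x01
stamp_crc(new)

# 3. Only the first sector (holding the new CRC) makes it to disk.
torn = torn_write(old, new, sectors_written=1)

print("old page CRC ok: ", crc_ok(old))
print("new page CRC ok: ", crc_ok(new))
print("torn page CRC ok:", crc_ok(torn))
```

Both the old and the new page verify on their own; only the mixed page
fails, even though no data was actually lost, since the hint bit is
not essential. That is the false alarm being discussed.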
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +