Re: New CRC algorithm: Slicing by 8

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: <mark(at)mark(dot)mielke(dot)cc>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Gurjeet Singh" <singh(dot)gurjeet(at)gmail(dot)com>, "PGSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New CRC algorithm: Slicing by 8
Date: 2006-10-24 17:31:03
Message-ID: 1161711064.3861.235.camel@silverbirch.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 2006-10-24 at 12:47 -0400, mark(at)mark(dot)mielke(dot)cc wrote:
> On Tue, Oct 24, 2006 at 05:05:58PM +0100, Simon Riggs wrote:
> > On Tue, 2006-10-24 at 10:18 -0400, mark(at)mark(dot)mielke(dot)cc wrote:
> > > On Tue, Oct 24, 2006 at 02:52:36PM +0100, Simon Riggs wrote:
> > > > On Tue, 2006-10-24 at 09:37 -0400, Tom Lane wrote:
> > > > > No, because unlike tuples, WAL records can and do cross page boundaries.
> > > > But not that often, with full_page_writes = off. So we could get away
> > > > with just CRC checking the page-spanning ones and mark the records to
> > > > show whether they have been CRC checked or not and need to be rechecked
> > > > at recovery time. That would reduce the CRC overhead to about 1-5% of
> > > > what it is now (as an option).
> > >
> > > WAL pages 8 Kbytes, and disk pages 512 bytes, correct? I don't see a
> > > guarantee in here that the 8 Kbytes worth of data will be written as
> > > sequential writes, nor that the 8 Kbytes of data will necessarily
> > > finish.
> > >
> > > If the operating system uses 8 Kbyte pages, or the RAID system uses 8
> > > Kbytes or larger chunks, and they guarantee sequential writes, perhaps
> > > it is ok. Still, if the power goes out after writing the first 512
> > > bytes, 2048 bytes, or 4096 bytes, then what? With RAID involved it
> > > might get better or worse, depending on the RAID configuration.
> >
> > That is the torn-page problem. If your system doesn't already protect
> > you against this you have no business turning off full_page_writes,
> > which was one of my starting assumptions.
>
> I wasn't aware that a system could protect against this. :-)
>
> I write 8 Kbytes - how can I guarantee that the underlying disk writes
> all 8 Kbytes before it loses power? And why isn't the CRC a valid means
> of dealing with this? :-)
>
> I'm on wrong on one of these assumptions, I'm open to being educated.
> My opinion as of a few seconds ago, is that a write to a single disk
> sector is safe, but that a write that extends across several sectors
> is not.

http://developer.postgresql.org/pgdocs/postgres/runtime-config-wal.html

full_page_writes = off

I'm very happy to learn more myself...

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Joshua D. Drake 2006-10-24 17:46:23 Re: Incorrect behavior with CE and ORDER BY
Previous Message Tom Lane 2006-10-24 17:08:28 Re: Incorrect behavior with CE and ORDER BY