From: | mark(at)mark(dot)mielke(dot)cc |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: New CRC algorithm: Slicing by 8 |
Date: | 2006-10-24 16:47:09 |
Message-ID: | 20061024164709.GB9658@mark.mielke.cc |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Oct 24, 2006 at 05:05:58PM +0100, Simon Riggs wrote:
> On Tue, 2006-10-24 at 10:18 -0400, mark(at)mark(dot)mielke(dot)cc wrote:
> > On Tue, Oct 24, 2006 at 02:52:36PM +0100, Simon Riggs wrote:
> > > On Tue, 2006-10-24 at 09:37 -0400, Tom Lane wrote:
> > > > No, because unlike tuples, WAL records can and do cross page boundaries.
> > > But not that often, with full_page_writes = off. So we could get away
> > > with just CRC checking the page-spanning ones and mark the records to
> > > show whether they have been CRC checked or not and need to be rechecked
> > > at recovery time. That would reduce the CRC overhead to about 1-5% of
> > > what it is now (as an option).
> >
> > WAL pages 8 Kbytes, and disk pages 512 bytes, correct? I don't see a
> > guarantee in here that the 8 Kbytes worth of data will be written as
> > sequential writes, nor that the 8 Kbytes of data will necessarily
> > finish.
> >
> > If the operating system uses 8 Kbyte pages, or the RAID system uses 8
> > Kbytes or larger chunks, and they guarantee sequential writes, perhaps
> > it is ok. Still, if the power goes out after writing the first 512
> > bytes, 2048 bytes, or 4096 bytes, then what? With RAID involved it
> > might get better or worse, depending on the RAID configuration.
>
> That is the torn-page problem. If your system doesn't already protect
> you against this you have no business turning off full_page_writes,
> which was one of my starting assumptions.
I wasn't aware that a system could protect against this. :-)
I write 8 Kbytes - how can I guarantee that the underlying disk writes
all 8 Kbytes before it loses power? And why isn't the CRC a valid means
of dealing with this? :-)
I'm on wrong on one of these assumptions, I'm open to being educated.
My opinion as of a few seconds ago, is that a write to a single disk
sector is safe, but that a write that extends across several sectors
is not.
Cheers,
mark
--
mark(at)mielke(dot)cc / markm(at)ncf(dot)ca / markm(at)nortel(dot)com __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada
One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2006-10-24 17:08:28 | Re: Incorrect behavior with CE and ORDER BY |
Previous Message | Joshua D. Drake | 2006-10-24 16:33:21 | Incorrect behavior with CE and ORDER BY |