
Re: Block-level CRC checks

From: Greg Stark <greg(dot)stark(at)enterprisedb(dot)com>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Brian Hurt <bhurt(at)janestcapital(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Block-level CRC checks
Date: 2008-10-02 17:36:23
Message-ID: BB469073-64E8-4CC8-A8E3-0672242D5347@enterprisedb.com
Lists: pgsql-hackers

On 2 Oct 2008, at 05:51 PM, Aidan Van Dyk <aidan(at)highrise(dot)ca> wrote:

> So if PG currently doesn't care about the hint bits being updated
> during the write, then why should introducing a double-buffer
> introduce the torn-page problem Tom mentions?  I admit, I'm fishing
> for information from those in the know, because I haven't been
> looking at the code long enough (or all of it enough) to know all
> the ins-and-outs...

It's not the buffering, it's the checksum. The problem arises if a page
is read in but no WAL-logged modifications are made to it. If a hint
bit is modified, the change won't be WAL-logged, but the page is still
marked dirty.

When we write the page, there's a chance that only part of it actually
makes it to disk if the system crashes before the whole page is flushed.

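WAL-logged changes are safe because of full_page_writes: the first
change to a page after a checkpoint logs a full image of the page, so a
torn page can be restored during recovery. Hint bits are safe because
either the old or the new value will be on disk and we don't care
which. It doesn't matter if some hint bits are set and some aren't.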

However, the checksum won't match, because it will have been
calculated over the whole block and part of the block was never written.
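
To make the failure concrete, here is a toy model of it. This is only a
sketch, not PostgreSQL code: the page layout (a 4-byte CRC followed by
the body), the 512-byte atomic sector size, the hint-bit offset and all
function names are assumptions made up for the illustration.

```python
import zlib

PAGE_SIZE = 8192     # PostgreSQL's default block size
SECTOR_SIZE = 512    # assumed unit that the disk writes atomically
HINT_OFFSET = 6000   # hypothetical position of a hint-bit byte in the body


def make_page(body: bytes) -> bytes:
    """Toy layout: 4-byte CRC-32 of the body, followed by the body."""
    return zlib.crc32(body).to_bytes(4, "big") + body


def verify(page: bytes) -> bool:
    """Return True if the stored CRC matches the body."""
    return int.from_bytes(page[:4], "big") == zlib.crc32(page[4:])


def torn_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Simulate a crash after only the first N sectors reached disk."""
    cut = sectors_written * SECTOR_SIZE
    return new[:cut] + old[cut:]


# The page as it sits on disk before the hint bit is set.
old_body = bytes(PAGE_SIZE - 4)
old_page = make_page(old_body)

# Same page with one hint bit set; nothing is WAL-logged for this change.
new_body = bytearray(old_body)
new_body[HINT_OFFSET] |= 0x01
new_page = make_page(bytes(new_body))

# Crash after the first sector: the new checksum is on disk, but the
# sector that holds the hint bit still has its old contents.
torn = torn_write(old_page, new_page, sectors_written=1)

print(verify(old_page), verify(new_page), verify(torn))  # True True False
```

The torn page holds perfectly usable data (the body is identical to the
old one), yet it fails verification, and there is no full-page image in
the WAL to repair it from.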


Writing this explanation did bring to mind one solution which we had  
already discussed for other reasons: not marking blocks dirty after  
hint bit setting.

Alternatively, is the only case we need to worry about the one where we
detect that a block is dirty but its LSN is older than the last
checkpoint? In that case we could either discard the write or generate
a no-op WAL record just to get the full-page write.
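
Roughly, the check I have in mind would look like the sketch below. The
names (Buffer, FlushAction, plan_flush) are hypothetical and do not
correspond to anything in the backend, and I'm assuming "older than the
last checkpoint" means the page LSN is at or before the checkpoint's
LSN.

```python
from dataclasses import dataclass
from enum import Enum, auto


class FlushAction(Enum):
    SKIP = auto()                      # page is clean, nothing to do
    WRITE = auto()                     # a torn write is repairable from WAL
    LOG_FULL_PAGE_THEN_WRITE = auto()  # emit a no-op record with a page image
    DISCARD = auto()                   # drop the hint-bit-only change


@dataclass
class Buffer:
    dirty: bool
    lsn: int  # LSN of the last WAL-logged change to this page


def plan_flush(buf: Buffer, checkpoint_lsn: int,
               discard_hint_only_writes: bool = False) -> FlushAction:
    """Decide how to flush a buffer so a torn write cannot break its checksum."""
    if not buf.dirty:
        return FlushAction.SKIP
    if buf.lsn > checkpoint_lsn:
        # Assumption: a WAL-logged change since the checkpoint already
        # carried a full-page image, so recovery can restore the page.
        return FlushAction.WRITE
    # Dirty with an old LSN: only un-logged changes (hint bits) since the
    # checkpoint, so no full-page image protects this write.
    if discard_hint_only_writes:
        return FlushAction.DISCARD
    return FlushAction.LOG_FULL_PAGE_THEN_WRITE


print(plan_flush(Buffer(dirty=True, lsn=50), checkpoint_lsn=100))
```

Which of the two fallbacks is better is an open question: discarding
loses the hint-bit work, while the no-op record adds WAL volume.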
   
