Re: Block-level CRC checks

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <greg(dot)stark(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Block-level CRC checks
Date: 2008-11-13 19:25:40
Message-ID: 20081113192540.GE4062@alvh.no-ip.org
Lists: pgsql-hackers

Aidan Van Dyk wrote:
>
> I think I'm missing something...
>
> In this patch, I see you writing WAL records for hint-bits (bufmgr.c
> FlushBuffer). But doesn't XLogInsert then make a "backup block" record (unless
> it's already got one since last checkpoint)?

I'm not causing a backup block to be written with that WAL record. The
rationale is that it's not needed -- if there was a critical write to
the page, then there's already a backup block. If the only write was a
hint bit being set, then the page cannot possibly be torn.

Now that I think about this, I wonder if this can cause problems in some
filesystems. XFS, for example, zeroes out during recovery any block
that was written to but not fsync'ed before a crash. This means that if
we change a hint bit after a checkpoint and mark the page dirty, the
system can write the page. Suppose we crash at this point. On
recovery, XFS will zero out the block, but there will be nothing with
which to recover it, because there's no backup block ...
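
To make the failure case concrete, here is a toy model of the scenario
in C. It is only a sketch under stated assumptions: PageState,
ChangeKind, apply_change and recoverable_after_zeroing are hypothetical
names made up for illustration, not anything in bufmgr.c or xlog.c, and
the filesystem behaviour is reduced to a single flag.

```c
#include <stdbool.h>

/* Hypothetical, simplified model; none of these names exist in PostgreSQL. */
typedef struct {
    bool has_backup_block;  /* full-page image logged since the last checkpoint */
    bool written_unsynced;  /* page was written but not fsync'ed before the crash */
} PageState;

typedef enum { CHANGE_HINT_BIT, CHANGE_CRITICAL } ChangeKind;

/*
 * Apply a change made after a checkpoint. In this model a critical
 * change logs a backup block, while a hint-bit change logs a WAL
 * record without one (the approach described in this message).
 */
void apply_change(PageState *page, ChangeKind kind)
{
    if (kind == CHANGE_CRITICAL)
        page->has_backup_block = true;
}

/*
 * Assume the filesystem zeroes any block that was written but not
 * fsync'ed before the crash. The page can then be restored only if
 * WAL holds a backup block for it.
 */
bool recoverable_after_zeroing(const PageState *page)
{
    if (!page->written_unsynced)
        return true;  /* nothing was zeroed */
    return page->has_backup_block;
}
```

In this model a page whose only change since the checkpoint is a hint
bit, and which was written out but not fsync'ed, comes back as not
recoverable. That is the case I'm worried about.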

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
