Re: BBU Cache vs. spindles

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: James Mansion <james(at)mansionfamily(dot)plus(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Bruce Momjian <bruce(at)momjian(dot)us>, jd(at)commandprompt(dot)com, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, Steve Crawford <scrawford(at)pinpointresearch(dot)com>, pgsql-performance(at)postgresql(dot)org, Ben Chobot <bench(at)silentmedia(dot)com>
Subject: Re: BBU Cache vs. spindles
Date: 2010-10-24 16:53:13
Message-ID: 4CC46479.2050206@2ndquadrant.com
Lists: pgsql-performance pgsql-www

James Mansion wrote:
> When I looked at the internals of TokyoCabinet for example, the design
> was flawed but
> would be 'fairly robust' so long as mmap'd pages that were dirtied did
> not get persisted
> until msync, and were then persisted atomically.

If TokyoCabinet presumes that's true and overwrites existing blocks in place
on that assumption, it would land on my list of databases I wouldn't
trust to hold my TODO list. Flip off power to a server, and you have no
idea what portion of the blocks sitting in the drive's cache actually
made it to disk; that's not even guaranteed atomic to the byte level.
Torn pages happen all the time unless you either a) put the entire write
into a non-volatile cache before writing any of it, b) write and sync
somewhere else first and then do a journaled filesystem pointer swap
from the old page to the new one, or c) journal the whole write the way
PostgreSQL does with full_page_writes and the WAL. The discussion here
veered off into whether (a) is sufficiently satisfied just by having a
RAID controller with battery backup. What I concluded from the dive
into the details is that it definitely isn't unless the filesystem
block size exactly matches the database one. And even then, make sure
you test heavily.
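
For anyone who wants to see what (b) looks like in practice, here is a
rough file-level sketch in Python. It is an illustration only, not how
TokyoCabinet or PostgreSQL does anything: the function name
replace_atomically is made up for this example, it swaps a whole file
rather than a single page, and it assumes fsync really does push data to
stable storage, which is exactly the assumption a volatile drive cache
can break.

```python
import os
import tempfile


def replace_atomically(path: str, data: bytes) -> None:
    """Write data to a new file, sync it, then rename it over path.

    Hypothetical helper for illustration. Readers see either the old
    contents or the new contents, never a partially written mix,
    provided the filesystem's rename is atomic and fsync is honored.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Step 1: write the new version somewhere else, in the same
    # directory so the later rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Step 2: sync the new copy before anything points at it.
            os.fsync(tmp.fileno())
        # Step 3: the "pointer swap": rename over the old file.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if the swap never happened.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # Step 4: sync the directory so the rename itself is durable.
    # os.O_DIRECTORY is not available on every platform, so this
    # step is skipped where it is missing.
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

The old copy is never overwritten in place, so a power cut at any step
leaves either the complete old file or the complete new one. None of
that helps if the drive acknowledges the sync while the data is still
sitting in a cache that loses power.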

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
