Re: Two hard drives --- what to do with them?

From: Shane Ambler <pgsql(at)Sheeky(dot)Biz>
To: Peter Kovacs <maxottovonstirlitz(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Carlos Moreno <moreno_pg(at)mochima(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Two hard drives --- what to do with them?
Date: 2007-02-27 18:51:41
Message-ID: 45E47DBD.6090803@Sheeky.Biz
Lists: pgsql-performance

Peter Kovacs wrote:

>> > The reason this becomes an issue is that many consumer-grade disks have
>> > write cache enabled by default and no way to make sure the cached data
>> > actually gets written. So, essentially, these disks "lie" and say they
>> > wrote the data, when in reality, it's in volatile memory. It's
>> > recommended that you disable write cache on such a device.
>>
>> From all that I have heard this is another advantage of SCSI disks -
>> they honor these settings as you would expect - many IDE/SATA disks
>> often say "sure I'll disable the cache" but continue to use it or don't
>> retain the setting after restart.
>
> As far as I know, SCSI drives also have "write cache" which is turned
> off by default, but can be turned on (e.g. with the sdparm utility on
> Linux). The reason I am so much interested in how write cache is
> typically used (on or off) is that I recently ran our benchmarks on a
> machine with SCSI disks and those benchmarks with high commit ratio
> suffered significantly compared to our previous results
> "traditionally" obtained on machines with IDE drives.

Most likely - with write cache enabled, the drive puts the data into its
cache as soon as it arrives, says "yep, all done", and you carry on while
it writes the data to the platters. But if the power goes out while it's
doing that, you've got trouble.

The difference between SCSI and IDE/SATA in this case is that many, if
not all, IDE/SATA drives tell you the cache is disabled when you ask them
to disable it, but they either don't actually disable it or don't retain
the setting, so you get caught later. SCSI disks can be trusted when you
set this option.
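
One rough way to check whether a drive really honors the setting is to time
synchronous writes. The sketch below is only an illustration and makes some
assumptions: a single rotating disk with no battery-backed controller in
front of it, and the common rule of thumb that such a disk can complete at
most about one synchronous rewrite of the same block per platter rotation
(rpm / 60 per second). The function names are made up for this example. If
the measured rate is far above that ceiling, something in the path is
probably still caching writes.

```python
import os
import tempfile
import time


def max_commits_per_sec(rpm: int) -> float:
    """Rule-of-thumb ceiling: about one synchronous rewrite per rotation."""
    return rpm / 60.0


def measure_fsync_rate(directory: str, writes: int = 200) -> float:
    """Rewrite the same 512-byte block, fsync each time, return fsyncs/s."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        block = b"x" * 512
        start = time.perf_counter()
        for _ in range(writes):
            os.lseek(fd, 0, os.SEEK_SET)   # always hit the same block
            os.write(fd, block)
            os.fsync(fd)                   # ask the OS to push it to the device
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.close(fd)
        os.remove(path)
    return writes / elapsed


if __name__ == "__main__":
    # Run this in a directory that lives on the disk you want to test.
    rate = measure_fsync_rate(".")
    print(f"measured: {rate:.0f} fsyncs/s")
    print(f"7200 rpm ceiling: {max_commits_per_sec(7200):.0f} per second")
```

Treat the result as a hint rather than proof: filesystems, controllers and
virtual machines can all change what fsync actually does.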

> I wonder if running a machine on a UPS + 1 hot standby internal PS is
> equivalent, in terms of data integrity, to using battery backed write
> cache. Instinctively, I'd think that UPS + 1 hot standby internal PS
> is better, since this setup also provides for the disk to actually
> write out the content of the cache -- as you pointed out.
>

These cover two different scenarios.
The UPS maintains power in the event of a blackout.
The hot standby internal PS maintains power when the first PS dies.

It is a good choice to have both, as a PS dying is just as bad as losing
power without a UPS, and the UPS won't save you if the PS goes.

A battery-backed RAID card sits in between these - as long as the
drive's write cache is off, the RAID card will hold data that was sent
to disk until the drive confirms it has been written. The battery backup
will even hold that data until the machine is switched back on, at which
point the card completes the write to disk. That would cover you even if
the PS goes.
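
To put the three options side by side, here is a small sketch of which
failure each one covers, as described above. The names (COVERS, uncovered)
are made up for this example, and the battery-backed RAID entry assumes the
drive's own write cache is off. Note that "covered" for the RAID card means
acknowledged writes survive, not that the machine stays up.

```python
# The two failure scenarios discussed in this thread.
FAILURES = ("blackout", "psu_failure")

# Which failures each protection covers, per the description above.
COVERS = {
    "ups": {"blackout"},                             # keeps mains power up
    "standby_psu": {"psu_failure"},                  # takes over when the first PS dies
    "bbu_raid_cache": {"blackout", "psu_failure"},   # holds unwritten data on battery
}


def uncovered(protections):
    """Return the set of failures that none of the given protections covers."""
    covered = set()
    for name in protections:
        covered |= COVERS[name]
    return set(FAILURES) - covered


if __name__ == "__main__":
    for setup in (["ups"], ["ups", "standby_psu"], ["bbu_raid_cache"]):
        print(setup, "leaves uncovered:", sorted(uncovered(setup)))
```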

--

Shane Ambler
pgSQL(at)Sheeky(dot)Biz

Get Sheeky @ http://Sheeky.Biz
