| From: | Greg Smith <greg(at)2ndquadrant(dot)com> | 
|---|---|
| To: | Bruce Momjian <bruce(at)momjian(dot)us> | 
| Cc: | Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, pgsql-performance <pgsql-performance(at)postgresql(dot)org> | 
| Subject: | Re: SSD + RAID | 
| Date: | 2010-03-02 06:13:29 | 
| Message-ID: | 4B8CAC89.1020100@2ndquadrant.com | 
| Lists: | pgsql-performance | 
Bruce Momjian wrote:
> I always assumed SCSI disks had a write-through cache and therefore
> didn't need a drive cache flush command.
>   
There's more detail on all this mess at 
http://wiki.postgresql.org/wiki/SCSI_vs._IDE/SATA_Disks and it includes 
this perception, which I've recently come to believe isn't actually 
correct anymore.  Like the IDE crowd, it looks like one day somebody 
said "hey, we lose every write-heavy benchmark badly because we only 
have a write-through cache", and that principle fell by the 
wayside.  What has been true, and I'm starting to think this is what 
we've all been observing rather than a write-through cache, is that the 
proper cache flushing commands have been there in working form for so 
much longer that it's more likely your SCSI driver and drive do the 
right thing if the filesystem asks them to.  SCSI SYNCHRONIZE CACHE has 
a much longer and prouder history than IDE's FLUSH_CACHE and SATA's 
FLUSH_CACHE_EXT.
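To make "if the filesystem asks them to" concrete, here is a minimal sketch of what the request looks like from the application side.  Nothing in it comes from the thread: the helper names durable_write and measure_fsync_rate are made up for illustration.  os.fsync() only asks the OS to push the data to stable storage; whether that turns into SYNCHRONIZE CACHE or FLUSH_CACHE_EXT at the drive depends on the filesystem, driver, and drive.  The second helper assumes the usual rule of thumb that a single rotating disk that really flushes cannot sustain many more fsyncs per second than it has rotations per second.

```python
import os
import time


def durable_write(path, payload):
    """Write payload to path, then ask the OS to flush it to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(payload)
        written = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while written < len(view):
            written += os.write(fd, view[written:])
        # This is the request; it is not proof that the drive's own
        # cache was actually flushed.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


def measure_fsync_rate(path, count=50):
    """Return fsync calls per second when rewriting one byte repeatedly."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, b"x")
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return count / elapsed if elapsed > 0 else float("inf")
```

A rate far above what the platter speed allows (roughly 120 per second at 7200 RPM) hints that some cache along the path is acknowledging writes before they reach the disk, though a battery-backed controller cache produces the same numbers legitimately.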
It's also worth noting that many current SAS drives, the current SCSI 
incarnation, are basically SATA drives with a bridge chipset stuck onto 
them, or with just the interface board swapped out.  This is one reason 
why top-end SAS capacities lag behind consumer SATA drives.  They use the 
consumers as beta testers to get the really fundamental firmware issues 
sorted out, and once things are stable they start stamping out the 
version with the SAS interface instead.  (Note that there's a parallel 
manufacturing approach that makes much smaller SAS drives, the 2.5" 
server models or those at higher RPMs, that doesn't go through this 
path.  Those are also the really expensive models, because they lack the 
economies of scale.)  The idea that these would have fundamentally different 
write cache behavior doesn't really follow from that development model.
At this point, there are only two common differences between "consumer" 
and "enterprise" hard drives of the same size and RPM when there are 
directly matching ones:
1) You might get SAS instead of SATA as the interface, which provides 
the more mature command set I was talking about above--and therefore may 
give you a sane write-back cache with proper flushing, which is all the 
database really expects.
2) The timeouts when there's a read/write problem are tuned down in the 
enterprise version, to be more compatible with RAID setups where you 
want to push the drive off-line when this happens rather than presuming 
you can fix it.  Consumers would prefer that the drive spend a lot of 
time doing heroics to try and save their sole copy of the apparently 
missing data.
You might get a slightly higher grade of parts if you're lucky too; I 
wouldn't count on it though.  That seems to be saved for the high RPM or 
smaller size drives only.
-- 
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com   www.2ndQuadrant.us