Re: performance on new linux box

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: performance on new linux box
Date: 2010-07-16 11:36:10
Message-ID: 4C40442A.4030806@postnewspapers.com.au
Lists: pgsql-performance

On 16/07/10 09:22, Scott Marlowe wrote:
> On Thu, Jul 15, 2010 at 10:30 AM, Scott Carey <scott(at)richrelevance(dot)com> wrote:
>>
>> On Jul 14, 2010, at 7:50 PM, Ben Chobot wrote:
>>
>>> On Jul 14, 2010, at 6:57 PM, Scott Carey wrote:
>>>
>>>> But none of this explains why a 4-disk raid 10 is slower than a 1 disk system. If there is no write-back caching on the RAID, it should still be similar to the one disk setup.
>>>
>>> Many raid controllers are smart enough to always turn off write caching on the drives, and also disable the feature on their own buffer without a BBU. Add a BBU, and the cache on the controller starts getting used, but *not* the cache on the drives.
>>
>> This does not make sense.
>
> Basically, you can have cheap, fast and dangerous (a drive with write
> cache enabled, which responds positively to fsync even when it hasn't
> actually fsynced the data). You can have cheap, slow and safe with a
> drive that has a cache, but since it'll be fsyncing all the time the
> write cache won't actually get used. Or fast, expensive, and safe,
> which is what a BBU RAID card gets by saying the data is fsynced when
> it's actually just in cache, but a safe cache that won't get lost on
> power down.
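
For illustration, a minimal sketch (not PostgreSQL source, just the
general write-then-fsync pattern) of the durability guarantee the
"dangerous" case above breaks: fsync() returning success only means
anything if nothing below the kernel acknowledges the flush while the
data is still sitting in a volatile write cache.

/* Sketch only: write a block, then fsync() it. The durability promise
 * holds only if the drive/controller doesn't acknowledge the flush while
 * the data is still in a volatile write cache. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char buf[8192];
    memset(buf, 'x', sizeof buf);

    int fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    if (write(fd, buf, sizeof buf) != (ssize_t) sizeof buf) {
        perror("write");
        return 1;
    }

    /* write() only guarantees the data reached the OS page cache;
     * fsync() asks the kernel and the storage stack to make it durable. */
    if (fsync(fd) != 0) {
        perror("fsync");
        return 1;
    }

    close(fd);
    return 0;
}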

Speaking of BBUs... do you ever find yourself wishing you could use
software RAID with battery backup?

I tend to use software RAID quite heavily on non-database servers, as
it's cheap, fast, portable from machine to machine, and (in the case of
Linux 'md' raid) reliable. Alas, I can't really use it for DB servers
due to the need for write-back caching.

There's no technical reason I know of why sw raid couldn't write-cache
to some non-volatile memory on the host. A dedicated, battery-backed
pair of DIMMs on a PCI-E card mapped into memory would be ideal. Failing
that, a PCI-E card with onboard RAM+BATT or fast flash that presents an
AHCI interface so it can be used as a virtual HDD would do pretty well.
Even one of those SATA "RAM Drive" units would do the job, though
forcing everything through the SATA2 bus would be a performance downside.
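
To make the "mapped into memory" variant concrete, here's a purely
hypothetical sketch of what the userspace side could look like, assuming
a made-up device node (/dev/nvram_cache0) exposed the card's
battery-backed RAM: the caching layer would mmap() the region and stage
dirty blocks there, trusting the battery rather than an immediate flush
to the platters.

/* Hypothetical sketch: treat a battery-backed PCI-E RAM card exposed as
 * a character device (device path is made up, no such driver exists) as
 * a write-back cache region for software RAID. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define CACHE_SIZE (64 * 1024 * 1024)   /* say, 64 MB of NVRAM */

int main(void)
{
    /* "/dev/nvram_cache0" is a placeholder name, not a real device. */
    int fd = open("/dev/nvram_cache0", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    unsigned char *cache = mmap(NULL, CACHE_SIZE, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
    if (cache == MAP_FAILED) { perror("mmap"); return 1; }

    /* Stage a dirty 4 kB block in the battery-backed region; it can be
     * written back to the underlying md array at leisure, surviving a
     * power loss in the meantime. */
    unsigned char block[4096];
    memset(block, 0xAB, sizeof block);
    memcpy(cache, block, sizeof block);

    munmap(cache, CACHE_SIZE);
    close(fd);
    return 0;
}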

The only issue I see with sw raid write caching is that it probably
couldn't be done safely on the root file system. The OS would have to
come up, init software raid, and find the caches before it'd be safe to
read or write volumes with s/w raid write caching enabled. It's not the
sort of thing that'd be practical to implement in GRUB's raid support.

--
Craig Ringer
