Re: How to improve db performance with $7K?

From: Alan Stange <stange(at)rentec(dot)com>
To: Alex Turner <armtuk(at)gmail(dot)com>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, William Yu <wyu(at)talisys(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: How to improve db performance with $7K?
Date: 2005-04-18 17:34:28
Message-ID: 4263EFA4.2000900@rentec.com
Lists: pgsql-performance

Alex Turner wrote:

>[snip]
>
>
>>Adding drives will not let you get lower response times than the average seek
>>time on your drives*. But it will let you reach that response time more often.
>>
>>
>>
>[snip]
>
>I believe your assertion is fundamentally flawed. Adding more drives
>will not let you reach that response time more often. All drives are
>required to fill every request in all RAID levels (except possibly
>0+1, but that isn't used for enterprise applications). Most requests
>in OLTP require most of the request time to seek, not to read. Only
>in single large block data transfers will you get any benefit from
>adding more drives, which is atypical in most database applications.
>For most database applications, the only way to increase
>transactions/sec is to decrease request service time, which is
>generally achieved with better seek times or a better controller card,
>or possibly spreading your database across multiple tablespaces on
>separate partitions.
>
>My assertion therefore is that simply adding more drives to an already
>competent* configuration is about as likely to increase your database
>effectiveness as Swiss cheese is to make your car run faster.
>
>

Consider the case of a mirrored file system with a mostly read()
workload. Typical behavior is to use a round-robin method for issuing
the read operations to each mirror in turn, but one can use other
methods, like a geometric algorithm that issues each read to the
drive whose head is located closest to the desired track. Some
systems have many mirrors of the data for exactly this behavior. In
fact, one can carry this logic to the extreme and have one drive for
every cylinder in the mirror, thus removing seek latencies completely.
This extreme case would also remove the rotational latency, as
the cylinder will be in the disk's read cache. :-) Of course, writing
data would be a bit slow!
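
As a rough illustration of the two read policies, here is a toy
simulation. It is only a sketch under loud assumptions: uniformly random
target cylinders, one outstanding read at a time, and seek cost measured
as distance in cylinders. The function name, the cylinder count and the
policy labels are all made up for this example and do not describe any
particular RAID implementation.

```python
import random

def mean_seek(num_mirrors, policy, requests, cylinders=10_000, seed=0):
    """Mean head movement (in cylinders) per read on a toy mirror set.

    Assumptions (not from any real RAID driver): reads target uniformly
    random cylinders and are served one at a time.
    """
    rng = random.Random(seed)
    heads = [0] * num_mirrors  # current head position of each mirror
    total = 0
    for i in range(requests):
        target = rng.randrange(cylinders)
        if policy == "round_robin":
            # Hand reads to each mirror in turn, ignoring head position.
            drive = i % num_mirrors
        else:
            # "nearest": pick the mirror whose head is closest to the target.
            drive = min(range(num_mirrors), key=lambda k: abs(heads[k] - target))
        total += abs(heads[drive] - target)
        heads[drive] = target  # the chosen head ends up on the target
    return total / requests

if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, mean_seek(n, "round_robin", 20_000), mean_seek(n, "nearest", 20_000))
```

In this toy model round-robin leaves the per-read seek distance roughly
unchanged as mirrors are added (its benefit is that the mirrors work in
parallel, which the model does not capture), while the nearest-head
policy shortens the average seek as the mirror count grows.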

I'm not sure I understand your assertion that "all drives are required
to fill every request in all RAID levels". After all, in mirrored
reads only one mirror needs to read any given block of data, so I don't
know what goal is achieved in making other mirrors read the same data.

My assertion (based on ample personal experience) is that one can
*always* get improved performance by adding more drives. Just limit
each drive to its first few cylinders, so that the average seek time is
greatly reduced, and concatenate the drives together. One can then build
the usual RAID device out of these concatenated metadevices. Yes, one
is wasting lots of disk space, but that's life. If your goal is
performance, then you need to put your money on the table. The
system will be somewhat unreliable because of the device count,
additional SCSI buses, etc., but that too is life in the high
performance world.
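
To put rough numbers on that trade-off, here is a small sketch. It
assumes uniformly random accesses within the used cylinders and treats
seek cost as proportional to distance, which real drives only
approximate. The function names and the 100 GB / 73 GB figures are
hypothetical, chosen only for illustration.

```python
import math
import random

def short_stroke_mean_seek(cylinders_used, samples=100_000, seed=0):
    """Estimate the mean seek distance when accesses stay within the
    first `cylinders_used` cylinders (uniform random accesses assumed)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        # Distance between two independent random positions in the used range.
        total += abs(rng.randrange(cylinders_used) - rng.randrange(cylinders_used))
    return total / samples

def drives_needed(data_gb, drive_gb, fraction_used):
    """Drives required to hold data_gb when only fraction_used of each
    drive's capacity is put to use."""
    return math.ceil(data_gb / (drive_gb * fraction_used))

if __name__ == "__main__":
    # Hypothetical: 100 GB of data on 73 GB drives.
    for fraction in (1.0, 0.5, 0.1):
        cylinders = int(10_000 * fraction)
        print(fraction, short_stroke_mean_seek(cylinders),
              drives_needed(100, 73, fraction))
```

Under these assumptions the mean seek distance comes out near one third
of the cylinder range in use, so using a tenth of each drive cuts it
about tenfold, at the price of going from 2 drives to 14 in the
hypothetical 100 GB case.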

-- Alan
