Skip site navigation (1) Skip section navigation (2)

Re: Tables on multiple disk drives

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Craig Thomas <craiger(at)osdl(dot)org>
Cc: <lists2(at)tokar(dot)ru>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Tables on multiple disk drives
Date: 2004-02-17 17:53:04
Message-ID: Pine.LNX.4.33.0402171045480.30812-100000@css120.ihs.com (view raw or flat)
Thread:
Lists: pgsql-performance
On Tue, 17 Feb 2004, Craig Thomas wrote:

> > On Tue, 17 Feb 2004, Konstantin Tokar wrote:
> >
> >> Hi!
> >> Does PostgreSQL allow to create tables and indices of a single
> >> database on multiple disk drives with a purpose of increase
> >> performance as Oracle database does? If a symbolic reference is the
> >> only method then the next question is: how can it be determined what
> >> file is referred to what table and index?
> >
> > You're life will be simpler, and your setup will be faster without
> > having  to muck about with it, if you just buy a good RAID controller
> > with battery  backed cache.  LSI/Megaraid and Adaptec both make
> > serviceable controllers  for reasonable prices, and as you add drives,
> > the speed just goes up, no  muddling around with sym links.
> 
> This works to a limited extent.  For very large databases, maximum
> throughput of I/O is the paramount factor for database performance.  With
> raid controllers, your LUN is still limited to a small number of disks.
> PostgreSQL can only write on a file system, but Oracle, SAP DB, DB2, etc
> can write directly to disk (raw I/O).  With large databases it is
> advantageous to spread a table across 100's of disks, if the table is
> quite large.  I don't know of any manufacturer that creates a 100 disk
> raid array yet.

You can run up to four LSI / Megaraids in one box, each with 3 UW SCSI 
interfaces, and they act as one unit.  That's 3*4*15 = 180 disks max.

With FC AL connections and four cards, it would be possible to approach 
1000 drives.  

Of course, I'm not sure how fast any RAID card setup is gonna be with that 
many drives, but ya never know.  My guess is that before you go there you 
buy a big external RAID box built for speed.  We have a couple of 200+ 
drive external RAID5 storage boxes at work that are quite impressive.

> Some of the problem can be addressed by using a volume manager (such as
> LVM in Linux, or Veritas on Unix-like systems).  This allows one to
> create a volume using partitions from many disks.  One can then create
> a file system and mount it on the volume.

Pretty much RAID arrays in software, which means no battery backed cache, 
which means it'll be fast at reading, but probably pretty slow at writes, 
epsecially if there's a lot of parallel access waiting to write to the 
database.

> However, to get the best performance, Raw I/O capability is the best
> way to go.

Unsupported statement made as fact.  I'm not saying it can't or isn't
true, but my experience has been that large RAID5 arrays are a great 
compromise between maximum performance and reliability, giving a good 
measure of each.  It doesn't take 100 drives to do well, even a dozen to 
two dozen will get you in the same basic range as splitting out files by 
hand with sym links without all the headache of chasing down the files, 
shutting down the database, linking it over etc...



In response to

Responses

pgsql-performance by date

Next:From: Craig ThomasDate: 2004-02-17 18:16:22
Subject: Re: Tables on multiple disk drives
Previous:From: Craig ThomasDate: 2004-02-17 17:35:38
Subject: Re: Tables on multiple disk drives

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group