Re: Tuning to speed select

From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Tom Laudeman" <twl8n(at)virginia(dot)edu>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Tuning to speed select
Date: 2006-08-11 14:41:56
Message-ID: b42b73150608110741i742d4ab1v8f1d6e4fa169c33d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 8/11/06, Tom Laudeman <twl8n(at)virginia(dot)edu> wrote:
> Merlin,
> The problem is disk. I've got a WD Caviar. hdparm says it does 44MB/sec
> (I ran that in single user mode so there was nothing interfering). A WD
> Caviar SE SATA in one of my servers at home gets 56MB/sec on a quiescent
> system at runlevel 3. What kind of values does hdparm give for a SATA
> Raptor?

i dont have one handy on a linux box to test, but all raptors are 10k
drives with 16 mb cache. they feel about twice as fast as 7200 rpm ide
drives in general use.

> I think my Dell Precision 650 has SATA on the motherboard. The boss says
> I can order one drive, so what should I get? How much faster is RAID 0+1
> than a single drive?

depends on a lot of factors, near 100% improvement is realistic even
with software raid. I would tell your boss that you could buy 2 36g
raptors (110$ each) do a simple raid 0. just be aware that either
drive failing will take you out. or, you could do raid 1 for
redundancy.

> Aside from size, I can't see much difference between these drives (WD
> Raptors at NewEgg):
> http://www.newegg.com/Product/ProductList.asp?DEPA=0&type=&Description=raptor&Submit=ENE&Ntk=all&N=0&minPrice=&maxPrice=&Go.x=0&Go.y=0

the retail parts are more expensive as is the silly drive that you can
look into. buy the cheapest part at the size level you need.

> CLUSTER certainly helped. Each of the following queries would have
> returned roughly 50,000 records. Note that selecting a single record
> from blast_result using an index is plenty fast ( ~ 50 ms), so my
> primary concern is pulling back larger subsets of data.

maybe. you may have had table bloat as well, cluster does a full table
rebuild like vacuum fuul.

> It appears that count(*) on a CLUSTERed table uses the index (as opposed
> to the old way of doing a sequential scan). Count on the table after
> CLUSTER appears to be a *lot* faster, maybe almost 100x. I know we
> shouldn't count, but we've been too lazy to keep the record counts in
> another table, and our customers occasionally want to know how many
> records are in a certain subset.

no, afaik count(*) uses the table still (try explain analyze). you
just compacted and optimized the table for efficient sequential scans.
are you vacuuming regulary?

merlin

In response to

Browse pgsql-general by date

  From Date Subject
Next Message John Purser 2006-08-11 15:05:44 Re: VACUUM VERBOSE output to STDERR
Previous Message Tom Laudeman 2006-08-11 14:12:40 Re: Tuning to speed select