Skip site navigation (1) Skip section navigation (2)

Hard Drive Usage for Speeding up Big Queries

From: Alex Hochberger <alex(at)dsgi(dot)us>
To: pgsql-performance(at)postgresql(dot)org
Subject: Hard Drive Usage for Speeding up Big Queries
Date: 2008-01-28 13:54:30
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
We are trying to optimize our Database server without spending a  
fortune on hardware.  Here is our current setup.

Main Drive array: 8x 750 GB SATA 2 Drives in a RAID 10 Configuration,  
this stores the OS, Applications, and PostgreSQL Data drives.  3 TB  
Array, 2 TB Parition for PostgreSQL.
Secondary drive array: 2x 36 GB SAS 15000 RPM Drives in a RAID 1  
Configuration: the pg_xlog directory, checkpoints set to use about 18  
GB max, this way when massive numbers of small writes occur, they  
don't slow the system down.  Drive failure loses no data.  Checkpoints  
will be another matter, hope to keep under control with bgwriter  

Now our "normal" activities are really fast.  Scanning data, etc., all  
runs pretty quickly.  What is NOT fast is some of the massive  
queries.  We have some Updates with joins that match a 100m line table  
with a 200m line table.  Outside of custom coding pl/pgsql code that  
creates the subfunction on the fly (which is up for consideration) to  
try to keep the Matching joins to an O(n) problem from the current  
O(n^2) one, we are looking at hardware as an option to help speed up  
these big batch queries that sometimes run for 5-6 days.

CPU not a problem, 2x Quad-core Xeon, never taxing more than 13%, this  
will change as more of our database functions are brought over here  
from the other servers
RAM is not upgradable, have 48GB of RAM on there.
Work_mem shouldn't be the issue, the big processes get Work_mem set to  
10GB, and if they are using temp tables, another 6-8GB for  
temp_buffers.  Maintenance Mem is set to 2 GB.

However, the joins of two 50GB tables really just can't be solved in  
RAM without using drive space.  My question is, can hardware speed  
that up?  Would putting a 400 GB SAS Drive (15000 RPM) in just to  
handle PostgreSQL temp files help?  Considering it would store "in  
process" queries and not "completed transactions" I see no reason to  
mirror the drive.  If it fails, we'd simply unmount it, replace it,  
then remount it, it could use the SATA space in the mean time.

Would that speed things up, and if so, where in the drive mappings  
should that partition go?

Thank you for your help.  I'm mostly interested in if I can speed  
these things up from 5-6 days to < 1 day, otherwise I need to look at  
optimizing it.



pgsql-performance by date

Next:From: MatthewDate: 2008-01-28 14:27:34
Subject: Re: Linux/PostgreSQL scalability issue - problem with 8 cores
Previous:From: Gregory StarkDate: 2008-01-28 12:41:03
Subject: Re: Performance issues migrating from 743 to 826

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group