Configuring a Large RAM PostgreSQL Server

From: Alex Hochberger <alex(at)dsgi(dot)us>
To: pgsql-performance(at)postgresql(dot)org
Subject: Configuring a Large RAM PostgreSQL Server
Date: 2007-11-29 18:28:35
Message-ID: C78952BC-8F12-486E-9BD3-368EFAF448AD@dsgi.us
Lists: pgsql-performance
Does anyone have any white papers or basic guides for a large RAM  
server?

We are consolidating two databases, currently running on a 4 GB and a 2 GB
machine, to enable better data mining.  The data issues on the 4 GB machine
are numerous: things like CREATE INDEX fail, and UPDATE queries die with
out-of-memory errors.  Refactoring the data is helping, but isn't finishing
the job.

The new machine will have 48 GB of RAM, so figuring out starting points for
shared_buffers, work_mem, and maintenance_work_mem is going to be a
crapshoot, since the defaults still seem to be based on 256 MB of RAM or
less.
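
For what it's worth, the starting point I'm guessing at looks roughly like
the snippet below -- the numbers are only rule-of-thumb guesses for a 48 GB
box (shared_buffers at a quarter of RAM or less, a big maintenance_work_mem
for index builds), not anything I've tested, and they assume an 8.2-or-later
server that accepts unit suffixes:

    # postgresql.conf -- rough, untested starting guesses for 48 GB of RAM
    shared_buffers = 8GB            # common rule of thumb: ~1/6 to 1/4 of RAM
    work_mem = 64MB                 # per sort/hash per backend; kept modest for concurrency
    maintenance_work_mem = 1GB      # for CREATE INDEX and VACUUM
    effective_cache_size = 32GB     # planner hint: roughly what the OS will cache
    checkpoint_segments = 32        # more WAL headroom for the bulk writes
    wal_buffers = 16MB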

Usage:
    Most of the time, the database is being hit with a handful of poorly
written and unoptimized queries from a Ruby on Rails app that is being
refactored as a simple Ruby-DBI app, since we need to support our legacy
code but don't need the Rails environment, just a lot of SQL writes.  Some
stored procedures should streamline this.  However, each transaction will do
about 5 or 6 writes.
    Common Usage: we have a reporting tool that is also being refactored and
does a lot of aggregate queries.  None of these take more than 500 ms after
indexing on the 2 GB database, so I'm assuming that more RAM should help and
eliminate those problems.
    Problem Usage: we have a 20 GB table with 120M rows that we are splitting
into sub-tables.  Generally, we do large data pulls from here, 1-4 million
records at a time, stored in a new table for export.  These queries are
problematic because we can't build the indexes they need; index creation runs
out of memory.  Most of my cleanup has resorted to FOR-IN loops in PL/pgSQL
to work through the data one row at a time, which is a problem because many
of these scripts take 4-5 days to run (see the sketch after this list).
    Other usage: we will import between 10k and 10m rows at a time from CSVs
into the big table (see the COPY sketch below).  I got my gig here because
this was all failing and the data was becoming worthless.  These imports
involve a lot of writes.
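
To be concrete about the row-at-a-time problem, the cleanup loops look
roughly like the first block below (table and column names are made up for
illustration), and what I'd like to be able to run instead, given enough
memory for the supporting indexes, is a single set-based statement like the
second:

    -- Hypothetical names: big_table, export_batch, id, col_a, col_b.
    -- Current approach: one row at a time in PL/pgSQL (takes days).
    CREATE OR REPLACE FUNCTION copy_batch(batch_start integer, batch_end integer)
    RETURNS void AS $$
    DECLARE
        r RECORD;
    BEGIN
        FOR r IN SELECT id, col_a, col_b FROM big_table
                 WHERE id BETWEEN batch_start AND batch_end
        LOOP
            INSERT INTO export_batch (id, col_a, col_b)
            VALUES (r.id, r.col_a, r.col_b);
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;

    -- What I'd rather run: one set-based pull into the export table.
    INSERT INTO export_batch (id, col_a, col_b)
    SELECT id, col_a, col_b
    FROM big_table
    WHERE id BETWEEN 1000000 AND 4000000;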

    Our simultaneous queries are small and currently run acceptably.  It's
the big imports, data-mining pulls, and system manipulation, where we
routinely wait days on a query, that we are looking to speed up.
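
For the CSV imports, the direction I'm leaning is COPY into a staging table
and then one set-based INSERT, instead of row-by-row inserts -- a rough,
untested sketch with placeholder names and path (COPY FROM a server-side
file needs superuser; psql's \copy is the client-side equivalent):

    -- Placeholder names: big_table, staging_import, /path/to/import.csv
    CREATE TABLE staging_import (LIKE big_table INCLUDING DEFAULTS);

    COPY staging_import FROM '/path/to/import.csv' WITH CSV;

    INSERT INTO big_table
    SELECT * FROM staging_import;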

Thanks,
Alex
