
Re: Configuration Advice

From: Adam Rich <adam(dot)r(at)sbcglobal(dot)net>
To: Steve <cheetah(at)tanabi(dot)org>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Configuration Advice
Date: 2007-01-17 21:24:02
Message-ID: 20070117212352.7B6EC9FB28E@postgresql.org
Lists: pgsql-performance
Doesn't sound like you want postgres at all.... Try mysql.



-----Original Message-----
From: "Steve" <cheetah(at)tanabi(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Sent: 1/17/2007 2:41 PM
Subject: [PERFORM] Configuration Advice

Hey there;

I've been lurking on this list a while, and I've been working with Postgres 
for a number of years, so I'm not exactly new to this.  But I'm still 
having trouble getting a good balance of settings and I'd like to see what 
other people think.  We may also be willing to hire a contractor to help 
tackle this problem if anyone is interested.

I've got an application here that runs large (in terms of length -- the 
queries have a lot of conditions in them) queries that can potentially 
return millions of rows but on average probably return tens of thousands 
of rows.  It's read only for most of the day, and pretty much all the 
queries except one are really fast.

However, each night we load data from a legacy COBOL system into the SQL 
system and then summarize that data to make the reports faster.  This 
load process is intensely insert/update driven but also includes a hefty 
amount of selects.  This load process is taking ever longer to complete.


SO ... our goal here is to make this load process take less time.  It 
seems the big part is building the big summary table; this summary 
table is currently 9 million rows big.  Every night, we drop the table, 
re-create it, build the 9 million rows of data (we use COPY to put the 
data in when it's prepared, not INSERT), and then build the indexes on it 
-- of which there are many.  Unfortunately this table gets queried 
in a lot of different ways and needs these indexes; also unfortunately, we 
have operator class indexes to support both ASC and DESC sorting on 
columns, so these are for all intents and purposes duplicates, but required 
under Postgres 8.1.  (We've recently upgraded to Postgres 8.2 -- is this 
still a requirement?)
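In rough sketch form, the nightly rebuild is something like this (table, 
column, and file names are invented for illustration; the real schema is 
much wider and has many more indexes):

```sql
-- Illustrative sketch only; real names and definitions differ.
DROP TABLE IF EXISTS summary;
CREATE TABLE summary (acct_id integer, period date, total numeric);

-- Bulk-load the prepared data with COPY rather than row-by-row INSERTs.
COPY summary FROM '/path/to/prepared_data.csv' WITH CSV;

-- Then build the many indexes, including the ASC/DESC pairs.
CREATE INDEX summary_acct_idx   ON summary (acct_id);
CREATE INDEX summary_period_idx ON summary (period);
```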

Building these indexes takes forever!  It's a long grind through inserts, 
and then building the indexes takes a hefty amount of time too (about 9 
hours).  Now, the application is likely partly at fault, and we're working 
to make it more efficient, but that has nothing to do with the index 
building time.  I'm wondering what, if anything, we can do to make this 
better; would it be better to leave the indexes on?  It doesn't seem to 
be.  Would it be better to use INSERTs instead of COPY?  Doesn't seem to 
be.


Anyway -- ANYTHING we can do to make this go faster is appreciated :) 
Here's some vital statistics:

- Machine is a 16 GB system with four physical dual-core Opteron CPUs, 
using SCSI disks.  The disk configuration seems to be a good one; it's the 
best of all the ones we've tested so far.

- The load process itself takes about 6 gigs of memory, the rest is free 
for postgres because this is basically all the machine does.

- If this was your machine and situation, how would you lay out the memory 
settings?  What would you set the FSM to?  Would you leave the bgwriter on 
or off?  We've already got fsync off because "data integrity" doesn't 
matter -- this stuff is religiously backed up and we've got no problem 
reinstalling it.  Besides, if this machine were to go down, data 
integrity of the DB would be the least of the worries :)

Do wal_buffers/full_page_writes matter if fsync is off?  If so, what 
settings?  What about checkpoints?
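For reference, the knobs usually mentioned for bulk loads on 8.x are along 
these lines.  The values below are illustrative guesses on my part, not 
numbers tuned for this box:

```
# postgresql.conf fragment -- illustrative values only
maintenance_work_mem = 1GB      # more memory for CREATE INDEX sorts
checkpoint_segments = 64        # fewer, larger checkpoints during the load
wal_buffers = 1MB               # small gain for heavy write bursts
full_page_writes = off          # arguably moot once fsync is off
```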

And finally, any ideas on planner constants?  Here's what I'm using:

seq_page_cost = 0.5                     # measured on an arbitrary scale
random_page_cost = 1.0                  # same scale as above
cpu_tuple_cost = 0.001                  # same scale as above
cpu_index_tuple_cost = 0.0001           # same scale as above
cpu_operator_cost = 0.00025             # same scale as above
effective_cache_size = 679006

I really don't remember how I came up with that effective_cache_size 
number....
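As a sanity check on that number: effective_cache_size is counted in 
pages (8 kB each by default), so a quick back-of-the-envelope conversion 
shows what the planner is assuming:

```python
# effective_cache_size is expressed in pages of BLCKSZ bytes (8 kB by
# default), so convert the setting to gigabytes to see the assumed cache.
PAGE_SIZE = 8192                  # default PostgreSQL block size in bytes
effective_cache_size = 679006     # the setting quoted above, in pages
gigabytes = effective_cache_size * PAGE_SIZE / 2**30
print(f"{gigabytes:.2f} GB")      # about 5.18 GB
```

So the planner is being told a bit over 5 GB of the 16 GB box is 
available for caching.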


Anyway... any advice would be appreciated :)


Steve

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

