Thank you very much for your quick response.
The servers are using Areca 1600 series controllers with battery backup and 2GB cache.
I really enjoyed your book (actually, both of the books your company published). Found them extremely helpful and they filled a lot of gaps in my still gappy knowledge J
From: Greg Smith [mailto:greg(at)2ndquadrant(dot)com]
Sent: Tuesday, February 01, 2011 4:54 AM
To: Benjamin Krajmalnik
Subject: Re: [PERFORM] Configuration for a new server.
Benjamin Krajmalnik wrote:
have a new set of servers coming in - Dual Xeon E5620's, 96GB RAM, 18 spindles (1 RAID1 for OS - SATA, 12 disk RAID10 for data - SAS, RAID-1 for logs - SAS, 2 hot spares SAS).
You didn't mention the RAID controller and its cache setup. That's a critical piece to get write, err, right. Presumably you've got a battery-backed RAID cache on your SAS controller. Knowing that and what model it is (to make sure it's one of the ones that performs well) would be good info to pass along here.
Is the 25% RAM for shared memory still a good number to go with for this size server?
Several people have reported to me they see drop-offs in performance between 8GB and 10GB for that setting. I currently recommend limiting shared_buffers to 8GB until we have more data on why that is. You suggested already having checkpoint issues, too; if that's true, you don't want to dedicate too much RAM to the database for that reason, too.
There are approximately 50 tables which get updated with almost 100% records updated every 5 minutes - what is a good number of autovacuum processes to have on these? The current server I am replacing only has 3 of them but I think I may gain a benefit from having more.
Watch pg_stat_user_tables and you can figure this out for your workload. There are no generic answers in this area.
Currently I have what I believe to be an aggressive bgwriter setting as follows:
bgwriter_delay = 200ms # 10-10000ms between rounds
bgwriter_lru_maxpages = 1000 # 0-1000 max buffers written/round
bgwriter_lru_multiplier = 10 # 0-10.0 multipler on buffers scanned/round
Does this look right?
You'd probably be better off decreasing the delay rather than pushing up the other two parameters. It's easy to tell if you did it right or not; just look at pg_stat_bgwriter. If buffers_backend is high relative to the others, that means the multiplier or delay is wrong. Or if maxwritten_clean is increasing fast, that means bgwriter_lru_maxpages is too low.
These are values which I arrived to by playing with them to make sure that the end user performance did not suffer. The checkpoints are taking about 8 minutes to complete, but between checkpoints the disk i/o on the data partition is very minimal - when I had lower segments running a 15 minute timeout with a .9 completion target, the platform was fairly slow vis-à-vis the end user.
The completion target isn't the main driver here, the number of segments/timeout is. When you space checkpoints out further, the actual amount of total I/O the server does decreases, both to the WAL and to the main database. So I suspect your tweaking the target had little impact, and it's possible you might even get smoother performance if you put it back to a higher value again.
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
In response to
pgsql-performance by date
|Next:||From: Benjamin Krajmalnik||Date: 2011-02-01 17:22:21|
|Subject: Re: Configuration for a new server.|
|Previous:||From: Ross J. Reedstrom||Date: 2011-02-01 16:42:53|
|Subject: Re: postgres 9 query performance|