I've just started working with a client that has been running Postgres (with
no DBA) for a few years. They're running version 8.1.4 on 4-way dell boxes
with 4Gig of memory on each box attached to RAID-10 disk arrays.
Some of their key config settings are here:
shared_buffers = 20480
work_mem = 16384
maintenance_work_mem = 32758
wal_buffers = 24
checkpoint_segments = 32
checkpoint_timeout = 300
checkpoint_warning = 30
effective_cache_size = 524288
autovacuum = on
autovacuum_naptime = 60
autovacuum_vacuum_threshold = 500
autovacuum_analyze_threshold = 250
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_cost_delay = -1
autovacuum_vacuum_cost_limit = -1
Currently I've found that they have approx 17 tables that show a significant
amount of bloat in the system. The worst one showing over 5million pages
worth of dead space. One of the problems is that their system is so busy with
activity during the day and massive data load processes at night that they
often kill the pid of vacuum processes kicked off by autovacuum because the
overall load impact disallows users from logging into the app since the login
process includes at least one db query that then seems to hang because there
are anywhere from 100 - 300 queries ahead of it at any given time. Normally a
user gets logged in with an avg wait of 5 - 10 seconds but when a long
running vacuum (sometimes due to a long running update process that's trying
to sort/update > 40million rows) is going the system gets to a state where
the login queries never get executed until the vacuum process is killed.
As a result of this I believe that the biggest table (the one with > 5million
pages worth of dead space) has never been vacuumed to completion. I suspect
this is the case for a few of the other top dead space tables as well but I
can't be sure.
My first priority was to get this vacuum scenario cleaned up. First off I
added the biggest table into pg_autovacuum and set the enabled column to
false ('f'). Then I set vacuum_cost_delay to 10 and in the same session
ran "vacuum analyze verbose big_table". This ran for 7.5 hours before we had
to kill it due to system load - and to make matters worse the high system
load was forcing many of the nightly batch queries that load, update, etc the
data to stack up to a point where the system was at less than 2% idle (CPU)
for the next 4 hours and barely responding to the command line.
To make matters worse I find out this morning that the db is at 85% per used
transaction ID's - again since a vacuum on the entire db has never been
As far as I can tell, the overall db size is currently 199G of which approx
104G seems to be valid data.
Here's my thoughts per how to proceed:
1) fix the big table ASAP (probably over the weekend) since it's not only the
biggest table but the most active like this:
a) run a pg_dump of this table
b) restore this dump into a new table (i.e. new_big_table)
c) lock the original big_table, sync any changes, inserts, deletes since we
did the dump from big_table into new_big_table
d) drop big_table
e) re-name new_big_table to big_table
* I may run through this for a few of the other large, highly active tables
that have minimal page density as well.
The development folks that have been here awhile tell me that it seems like
when they have a query (not limited to vacuum processes) that has been
running for a long time (i.e. > 5 or 6 hours) that the query sort of "goes
crazy" and the entire system gets pegged until they kill that process. - I've
not heard of this but I suspect upgrading to 8.2.4 is probably a good plan at
this point as well, so for step 2, I'll do this:
2) (obviously I'll do this in dev first, then in QA and finally in prod)
a) install verson 8.2.4 from source, leaving 8.1.4 in place
b) create the new 8.2.4 cluster on a new port
c) setup WAL archiving on the 8.1.4 cluster
d) do a full dump of the 8.1.4 cluster and restore it to the new 8.2.4
e) stop the 8.2.4 cluster and bring it up in recovery mode, pointing it to
the directory where we're archiving the 8.1.4 cluster's WAL segments.
f) once caught up, bring both clusters down
g) copy any final files from the 8.1.4 cluster's pg_xlog directory into the
new 8.2.4 pg_xlog dir (is this ok, since I'm moving 8.1.4 version tx logs
into an 8.2.4 xlog dir?)
h) Change the port on the 8.2.4 cluster to what the original 8.1.4 cluster
i) bring up the new 8.2.4 system, and actively manage the vacuum needs
moving fwd via a combination of autovacuum, cron processes for specififed
table vac's (daily, hourly, 15min, 5min, etc), and as needed interactive
The src based install will allow me to setup a robust upgrade CM process
capable of supporting multiple concurrent versions on a server if needed, the
ability to quickly revert to a previous version, etc however this is a
discussion for another day - I only mention it in case the question "why not
just use RPM's?" arises...
So here's my questions:
1) Does this sound like a good plan?
2) Are there other steps I should be taking, other Issues I should be
concerned about short-term, etc?
3) Does anyone have any additional advice for managing either this initial
mess, or the system(s) long term?
Thanks in advance...
pgsql-performance by date
|Next:||From: Gregory Stark||Date: 2007-08-24 20:06:53|
|Subject: Re: When/if to Reindex|
|Previous:||From: Steven Flatt||Date: 2007-08-24 17:49:01|
|Subject: Re: When/if to Reindex|