On Fri, Oct 28, 2011 at 2:20 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> hrm -- it doesn't look like you are i/o bound -- postgres is
> definitely the bottleneck. taking a dump off of production is
> throwing something else out of whack which is affecting your other
> band aid solutions might be:
> *) as noted above, implement hot standby and move dumps to the standby
> *) consider adding connection pooling so your system doesn't
> accumulate N processes during dump
Both are in the works. The 1st one is more involved, but I'm going to move
our monitoring to a pool next week, so I at least stop getting locked out.
We'll be moving to a non-superuser user, as well. That's an artifact of the
very early days that we never got around to correcting.
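
A minimal sketch of the pooling idea, to show why it stops processes from accumulating: the pool opens a fixed number of connections up front, and a caller that cannot get one within a timeout fails fast instead of adding another backend. Everything here is an assumption for illustration. `BoundedPool`, `fake_connect` and the pool size are hypothetical names and values, and a real deployment would more likely put an external pooler such as pgbouncer in front of the database.

```python
import queue
from contextlib import contextmanager


class BoundedPool:
    """Hands out at most `size` connections; extra callers wait, then fail."""

    def __init__(self, connect, size):
        # `connect` is any zero-argument callable returning a connection-like
        # object (hypothetical; a real one would open a database session).
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=1.0):
        # Block for up to `timeout` seconds waiting for a free connection.
        try:
            conn = self._idle.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("pool exhausted") from None
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the caller raised.
            self._idle.put(conn)


# Stand-in for a real connection factory, used only to count opens.
opened = []


def fake_connect():
    conn = object()
    opened.append(conn)
    return conn


pool = BoundedPool(fake_connect, size=2)
```

With this shape, a monitoring check that cannot get a connection reports a timeout rather than queueing up yet another server process behind the dump.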
> a better diagnosis might involve:
> *) strace of one of your non-dump processes to see where the blocking
> is happening
> *) profiling one of your user processes and comparing good vs bad time
This only happens at a particularly anti-social time, so we're taking the
easy way out up front and just killing various suspected processes each
night in order to narrow things down. It looks like it is actually an
interaction between the backups and a report generation process that runs a
bunch of fairly poorly architected queries. That process runs on a machine
set up with the wrong time zone, which was causing it to start at exactly
the same time as the backups. We fixed the time zone problem last night and
didn't have symptoms, so that's the fundamental problem, but the report
generation process has a lot of room for improvement regardless. It's now
really about picking the resolutions that offer the most bang for the buck.
I think a hot standby for backups and report generation is the biggest win,
and I can work on tuning the report generation at a later date.
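
To illustrate how a wrong time zone can line two jobs up, here is a small sketch that converts each job's wall-clock start time to UTC and checks whether the windows overlap. All of the numbers are hypothetical (the start times, durations and UTC offsets are made up for the example, not taken from our machines), and `utc_window` and `overlaps` are names invented here.

```python
from datetime import datetime, timedelta, timezone


def utc_window(local_start, utc_offset_hours, duration):
    """Return (start, end) in UTC for a job that starts at the naive
    wall-clock time `local_start` on a machine whose clock is
    `utc_offset_hours` away from UTC."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    start = local_start.replace(tzinfo=tz).astimezone(timezone.utc)
    return start, start + duration


def overlaps(a, b):
    """True if the half-open windows a and b share any time."""
    return a[0] < b[1] and b[0] < a[1]


# Hypothetical schedule: backup at 02:00 for two hours on a UTC-5 machine.
backup = utc_window(datetime(2000, 1, 1, 2, 0), -5, timedelta(hours=2))

# Report job scheduled for 04:30, meant to run after the backup finishes.
report_intended = utc_window(datetime(2000, 1, 1, 4, 30), -5, timedelta(hours=1))

# Same schedule, but the report machine's clock is set to UTC-4 by mistake,
# so the job really starts an hour earlier in absolute terms.
report_actual = utc_window(datetime(2000, 1, 1, 4, 30), -4, timedelta(hours=1))

print(overlaps(backup, report_intended))
print(overlaps(backup, report_actual))
```

Under these made-up numbers the intended schedule stays clear of the backup, while the one-hour offset error pushes the report job into the last half hour of the backup window.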
> Is there anything out of the ordinary about your application that's
> worth mentioning? using lots of subtransactions? prepared
> transactions? tablespaces? huge amounts of tables? etc?
Nope. Pretty normal.
> Have you checked syslogs/dmesg/etc for out of the ordinary system events?
Thanks for taking the time to go through my information and offer up