Re: long running commits

From: "Vaughn, Adam (IMS)" <VaughnA(at)imsweb(dot)com>
To: 'Kevin Grittner' <Kevin(dot)Grittner(at)wicourts(dot)gov>, "'pgsql-admin(at)postgresql(dot)org'" <pgsql-admin(at)postgresql(dot)org>
Cc: "Depuy, Scott (IMS)" <DepuyS(at)imsweb(dot)com>, "Meagher, Kevin (IMS)" <MeagherK(at)imsweb(dot)com>
Subject: Re: long running commits
Date: 2011-03-02 18:10:37
Message-ID: 6B760A41CEE02F46BADF29B4A874D6010F11C13ABF@FALCON.omni.imsweb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Thanks for the suggestions. I made all of the changes you mentioned except for the shared_buffers (which will require a downtime I have set for tonight). I do have another question though, why did you pick 512 MB for the new setting of shared_buffers? Everything I've ever read says that 25% of available RAM is a conservative value for shared_buffers.

Also, we had another one of these instances earlier today. During the 23 minute commit a single CPU was at 98% and it looked like all writes were backed up waiting for the commit to finalize. During the time our writing never got above 25 MB/s (far less than we can handle). Is it possible that we're missing an index somewhere or there's something else going on?

Thanks again in advance

-----Original Message-----
From: Kevin Grittner [mailto:Kevin(dot)Grittner(at)wicourts(dot)gov]
Sent: Friday, February 25, 2011 5:45 PM
To: Vaughn, Adam (IMS); 'pgsql-admin(at)postgresql(dot)org'
Cc: Depuy, Scott (IMS); Meagher, Kevin (IMS)
Subject: Re: [ADMIN] long running commits

"Vaughn, Adam (IMS)" <VaughnA(at)imsweb(dot)com> wrote:

> In the past 2 days we have been experiencing problems on our
> production server with very long running commit statements which
> seem to cause the entire instance to become blocked and
> non-responsive. These are the log messages for the 2 commit
> statements (these occurred inside of a 1 hour block)
>
> duration: 945599.193 ms execute S_2: COMMIT
> duration: 967507.896 ms execute S_2: COMMIT
>
> Can anyone provide some assistance in how we should diagnose this
> problem (tools) and some possible solutions for solving it?

Look for prior posts by Greg Smith on this topic. (Or better yet,
buy his book.) I don't believe anyone has a better handle on this
issue than he does.

That said, if you want emergency relief before you hear from him,
and a safer place from which to start tuning toward your optimal
settings based on the process he describes, I would make these
changes:

> bgwriter_delay 100
> bgwriter_lru_maxpages 100
> bgwriter_lru_multiplier 2.2

> shared_buffers 6GB

bgwriter_delay 200
bgwriter_lru_maxpages 1000
bgwriter_lru_multiplier 4

shared_buffers 512MB

These settings may not give you quite the same performance between
lockups, but the lockups should be less frequent and severe, and may
disappear entirely. You may not want to stay here, but it's
probably a relatively safe place from which to start the process of
incremental adjustments to find your best settings.

The overall idea of these settings is to push more of the writes
toward the disk soon enough to prevent a debilitating glut at fsync
time. We're getting better at dealing with this in each release, so
you may want to consider moving forward from 8.3 to help with it,
too.

-Kevin

Information in this e-mail may be confidential. It is intended only for the addressee(s) identified above. If you are not the addressee(s), or an employee or agent of the addressee(s), please note that any dissemination, distribution, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender of the error.

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Kevin Grittner 2011-03-02 19:04:56 Re: long running commits
Previous Message Jason Clark 2011-03-02 17:11:41 Re: HA warm standby issue with trigger file.