Skip site navigation (1) Skip section navigation (2)

Re: Shared buffers, db transactions commited, and write IO on Solaris

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Shared buffers, db transactions commited, and write IO on Solaris
Date: 2007-03-30 04:48:00
Message-ID: Pine.GSO.4.64.0703300010050.19323@westnet.com (view raw or flat)
Thread:
Lists: pgsql-performance
On Thu, 29 Mar 2007, Erik Jones wrote:

> As far as the procs responsible for the writes go, we were unable to see that 
> from the OS level as the guy we had as a systems admin last year totally 
> screwed us with the way he set up the SunCluster on the boxes and we have 
> been unable to run Dtrace which has left us watching a lot of iostat.

There are two processes spawned by Postgres that handle collecting 
statistics and doing the background writing.  You don't need any fancy 
tools (you Solaris guys and your Dtrace, sheesh) to see if they're busy. 
Just run top and switch the display to show the full command line instead 
of just the process name (on Linux the 'c' key does this) and you'll see 
the processes that had just been "postgres" before label themselves.

The system I saw get nailed by the bug Tom mentioned was also doing an 
update-heavy workload.  It manifested itself as one CPU spending almost 
all its time running the statistics collector.  That process's issues 
kicked up I/O waits from minimal to >25% and the background writer was 
incredibly sluggish as well.  The problem behavior was intermittant in 
that it would be crippling at times, but merely a moderate slowdown at 
others.  Your case sounds similar in several ways.

If you see the stats collector process taking up any significant amount of 
CPU time in top, you should strongly consider the possibility that you're 
suffering from this bug.  It's only a few characters to patch the bug if 
you don't want to wait for the next packaged release.  In your situation, 
I'd do it just eliminate this problem from your list of possible causes 
ASAP.

To fix, edit src/backend/postmaster/pgstat.c
Around line 1650 you'll find:

   write_timeout.it_value.tv_usec = PGSTAT_STAT_INTERVAL % 1000;

Change it to match the current line in the CVS tree for 8.3:

   write_timeout.it_value.tv_usec = (PGSTAT_STAT_INTERVAL % 1000) * 1000;

That's all it took to resolve things for me.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

In response to

pgsql-performance by date

Next:From: Peter SchullerDate: 2007-03-30 05:16:45
Subject: Scaling SELECT:s with the number of disks on a stripe
Previous:From: Erik JonesDate: 2007-03-30 04:27:06
Subject: Re: Shared buffers, db transactions commited, and write IO on Solaris

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group