Skip site navigation (1) Skip section navigation (2)

Re: Shared buffers, db transactions commited, and write IO on Solaris

From: Erik Jones <erik(at)myemma(dot)com>
To: "dimitri k" <dimitrik(dot)fr(at)gmail(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "PostgreSQL Performance" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Shared buffers, db transactions commited, and write IO on Solaris
Date: 2007-03-29 18:58:13
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
On Mar 29, 2007, at 12:41 PM, dimitri k wrote:

> On 3/29/07, Erik Jones <erik(at)myemma(dot)com> wrote:
>> On Mar 29, 2007, at 11:16 AM, Tom Lane wrote:
>> > Erik Jones <erik(at)myemma(dot)com> writes:
>> >> We've recently made a couple changes to our system that have  
>> resulted
>> >> in a drastic increase in performance as well as some very  
>> confusing
>> >> changes to the database statistics, specifically
>> >> pg_stat_database.xact_commit.  Here's the details:
>> >
>> > I'm kinda boggled too.  I can see how increasing shared buffers  
>> could
>> > result in a drastic reduction in write rate, if the working set of
>> > your
>> > queries fits in the new space but didn't fit in the old.  I have no
>> > idea
>> > how that leads to a drop in number of transactions committed  
>> though.
>> > It doesn't make sense that autovac would run less frequently,  
>> because
>> > it's driven by number of tuples changed not number of disk  
>> writes; and
>> > that could hardly account for a 10x drop anyway.
>> >
>> > Did you by any chance take note of exactly which processes were
>> > generating all the I/O or the CPU load?
>> Well, wrt to the CPU load, as I said, we're pretty sure that's
>> autovac as we still get spikes that hit about the same threshold,
>> after which cache hits go up dramatically and the spikes just don't
>> last two days anymore.
>> As far as the procs responsible for the writes go, we were unable to
>> see that from the OS level as the guy we had as a systems admin last
>> year totally screwed us with the way he set up the SunCluster on the
>> boxes and we have been unable to run Dtrace which has left us
>> watching a lot of iostat.  However, we did notice a direct
>> correlation between write spikes and "write intensive" queries like
>> large COPYs, UPDATEs, and INSERTs.
>> One very important thing to note here is that the number, or rather
>> rate, of disk writes has not changed.  It's the volume of data in
>> those writes that has dropped, along with those transaction
>> mysterious counts.  Could the bgwriter be the culprit here?  Does
>> anything it does get logged as a transaction?
>> erik jones <erik(at)myemma(dot)com>
>> software developer
>> 615-296-0838
>> emma(r)
> Erik,
> using 'forcedirectio' simply brings your write operations to the
> *real* volume - means while you need to write 10 bytes you'll write 10
> bytes (instead of UFS block size (8K)). So it explains me why your
> write volume became slower.

Sorry, that's not true.  Google "ufs forcedirectio" go to the first  
link and you will find:


The forcedirectio (read "force direct IO") UFS option causes data to  
be buffered in kernel address whenever data is transferred between  
user address space and the disk. In other words, it bypasses the file  
system cache. For certain types of applications -- primarily database  
systems -- this option can dramatically improve performance. In fact,  
some database experts have argued that a file using the forcedirectio  
option will outperform a raw partition, though this opinion seems  
fairly controversial.

The forcedirectio improves file system performance by eliminating  
double buffering, providing a small, efficient code path for file  
system reads and writes and removing pressure on memory."

However, what this does mean is that writes will be at the actual  
filesystem block size and not the cache block size (8K v. 512K).

> Now, why TX number is reduced - is a small mystery :)
> Options:
>   - you really do 10 times less commits, means you work 10 times  
> slower? ;)
>     what about users? how do you measure your work performance?

We are an email marketing service provider with a web front end  
application.  We measure work performance via web requests (counts,  
types, etc...), mailer activity and the resulting database activity.   
We are doing as much or more work now than previously, and faster.

>   - TX reported in pg_* tables are not exact, but I don't believe  
> at all :)

Even if they aren't exact, being off by a factor of 10 wouldn't be  
believable.  the forcedirectio mount setting for ufs can definitely  
explain the drop in data written volume, but doesn't do much to  
explain the difference in xact commits.

erik jones <erik(at)myemma(dot)com>
software developer

In response to


pgsql-performance by date

Next:From: Tom LaneDate: 2007-03-29 19:19:49
Subject: Re: Shared buffers, db transactions commited, and write IO on Solaris
Previous:From: Vincenzo RomanoDate: 2007-03-29 18:56:07
Subject: Weird performance drop

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group