Re: Just-in-time Background Writer Patch+Test Results

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Just-in-time Background Writer Patch+Test Results
Date: 2007-09-07 17:48:30
Message-ID: Pine.GSO.4.64.0709071324380.7439@westnet.com
Lists: pgsql-hackers

On Fri, 7 Sep 2007, Simon Riggs wrote:

> I think we should do some more basic tests to see where those outliers
> come from. We need to establish a clear link between number of dirty
> writes and response time.

With the test I'm running, which is specifically designed to aggravate
this behavior, the outliers on my system come from how Linux buffers
writes. I can adjust them a bit by playing with the parameters as
described at http://www.westnet.com/~gsmith/content/linux-pdflush.htm but
on the hardware I've got here (single 7200RPM disk for database, another
for WAL) they don't move much. Once /proc/meminfo shows enough Dirty
memory that pdflush starts blocking writes, game over; you're looking at
multi-second delays before my plain old IDE disks clear enough debris out
to start responding to new requests even with the Areca controller I'm
using.
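
For anyone who wants to watch the same counter, here is a minimal sketch
of pulling the Dirty and Writeback lines out of /proc/meminfo. It is not
the tooling used for the tests described here; the function names, the
one-second interval and the sample count are all assumptions made for
illustration.

```python
import time


def parse_meminfo(text):
    """Map /proc/meminfo field names to their integer values (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def dirty_summary(text):
    """Return only the two fields that matter for write-back stalls."""
    info = parse_meminfo(text)
    return {key: info.get(key) for key in ("Dirty", "Writeback")}


def watch(samples=5, interval=1.0, path="/proc/meminfo"):
    """Print a timestamped Dirty/Writeback reading a few times (hypothetical helper)."""
    for _ in range(samples):
        with open(path) as f:
            print(time.strftime("%H:%M:%S"), dirty_summary(f.read()), flush=True)
        time.sleep(interval)
```

Calling watch() on a Linux box while the test runs gives timestamped
readings that can be lined up against the slow statements in the server
log.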

> Perhaps output the number of dirty blocks written on the same line as
> the output of log_min_duration_statement so that we can correlate
> response time to dirty-block-writes on that statement.

On Linux at least, I'd expect this won't reveal much. There, the
interesting correlation is with how much dirty data is in the underlying
OS buffer cache. And exactly how that plays into things is a bit strange
sometimes. If you go back to Heikki's DBT2 tests with the background
writer schemes he tested, he got frustrated enough with that disconnect
that he wrote a little test program just to map out the underlying
weirdness:
http://archives.postgresql.org/pgsql-hackers/2007-07/msg00261.php

I've confirmed his results on my system and made some improvements to that
program myself, but pushed further work on it to the side to finish up the
main background writer task instead. I may circle back to that. I'd
really like to run all this on another OS as well (I have Solaris 10 on my
server box, but it's not fully set up yet), but I can only volunteer so
much time to work on all this right now.

If there's anything that needs to be looked at more carefully during tests
in this area, it's getting more data about just what the underlying OS is
doing while all this is going on. Just the output from vmstat/iostat is
very informative. Those using DBT2 for their tests get some nice graphs
of this already. I've previously run some pgbench-based tests that
captured this data, and they were very enlightening, but sadly that
system isn't available to me anymore.
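
As a rough illustration of what can be pulled out of saved vmstat output,
the sketch below turns it into rows keyed by the column names vmstat
prints (r, b, bo, wa and so on) and picks out samples where processes
were blocked. The function names and the min_blocked threshold are
assumptions for illustration, not part of any existing test harness, and
it expects plain "vmstat N" output without the timestamp option.

```python
def parse_vmstat(text):
    """Parse plain `vmstat N` output into a list of dicts keyed by column name."""
    rows = []
    header = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[:2] == ["r", "b"]:
            header = tokens  # vmstat repeats this line periodically
            continue
        # Data lines are all integers and match the header width; the
        # "procs ---memory---" banner line fails this check and is skipped.
        if header and len(tokens) == len(header) and all(t.isdigit() for t in tokens):
            rows.append(dict(zip(header, map(int, tokens))))
    return rows


def stalled_samples(rows, min_blocked=1):
    """Return samples with at least min_blocked processes in the 'b' column."""
    return [row for row in rows if row.get("b", 0) >= min_blocked]
```

Comparing the bo and wa values in the stalled samples against the
surrounding ones is one way to see when the OS starts forcing writes out.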

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
