Re: Who is causing all this i/o?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Craig James <craig_james(at)emolecules(dot)com>
Cc: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, pgsql-admin(at)postgresql(dot)org
Subject: Re: Who is causing all this i/o?
Date: 2011-05-21 15:11:12
Message-ID: 1174.1305990672@sss.pgh.pa.us
Lists: pgsql-admin

Craig James <craig_james(at)emolecules(dot)com> writes:
> On 5/20/11 4:25 PM, Scott Marlowe wrote:
>> On Fri, May 20, 2011 at 3:14 PM, Craig James <craig_james(at)emolecules(dot)com> wrote:
>>> Our development server (PG 8.4.4 on Ubuntu server) is constantly doing
>>> something, and I can't figure out what. The two production servers, which
>>> are essentially identical, don't show these symptoms. In a nutshell, it's
>>> showing 10K blocks per second of data going out, all the time, and
>>> essentially zero blocks per second of input.
>>> After a lot of digging around, I found this in the /postgres/pg_stat_tmp
>>> directory. If I list the directory including the i-nodes once every second,
>>> I find that a new 2MB file is being created roughly once every two seconds:

>> Have you got a lot of databases in your development environment? I
>> think that can sometimes cause a lot of pg_stat writes.

> Yes. The production servers have a dozen or so databases, but the development server has a couple hundred databases. Does that count as "a lot of databases"?

Yeah. I think what is happening is that the autovacuum launcher is
visiting every database, doing accesses to the system catalogs (and not
much more than that), which results in access-count updates in the stats
collector, which have to get written to disk.

What's not apparent, however, is why the stats collector is writing to
disk so much. 8.4 does have the logic change to not write stats out
unless something is asking to see them. So either the server is really
running a pre-8.4 release, or you have a monitoring task that is
constantly asking to see stats.

One possible band-aid solution is to increase autovacuum_naptime. This
is defined as the AV cycle time *for each database*, so AV wakes up and
touches another database every naptime/#databases seconds. If your
number of databases has been growing over time, this would probably
explain why the problem is getting worse.
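
To put rough numbers on the naptime arithmetic above, here is a small
sketch. It is only an illustration, not something measured on the server
in question. It assumes the default autovacuum_naptime of 1min (60
seconds); the database counts (12 and 200) are guesses standing in for
"a dozen or so" and "a couple hundred", and the function name is made up
for this example.

```python
def seconds_between_database_visits(naptime_seconds, n_databases):
    """Approximate gap between autovacuum launcher visits to successive databases.

    autovacuum_naptime is the cycle time per database, so the launcher
    moves on to another database roughly every naptime / n_databases seconds.
    """
    if n_databases <= 0:
        raise ValueError("n_databases must be positive")
    return naptime_seconds / n_databases


if __name__ == "__main__":
    naptime = 60  # assumed: default autovacuum_naptime of 1min
    for label, count in [("production (assumed 12 databases)", 12),
                         ("development (assumed 200 databases)", 200)]:
        gap = seconds_between_database_visits(naptime, count)
        print(f"{label}: one database touched about every {gap:.2f} s")
```

Under those assumptions the launcher touches a database about every 5
seconds on production but about every 0.3 seconds on development.
Raising autovacuum_naptime stretches that gap proportionally.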

regards, tom lane
