Re: Postgres 9.1.4 - high stats collector IO usage

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: David Barton <dave(at)oneit(dot)com(dot)au>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres 9.1.4 - high stats collector IO usage
Date: 2012-08-13 19:40:28
Message-ID: CAMkU=1wRP1qAkYsJyLd0CZSjCDeaRurQtPbh00snb=yAb=TYGw@mail.gmail.com
Lists: pgsql-performance

On Sun, Aug 12, 2012 at 7:17 PM, David Barton <dave(at)oneit(dot)com(dot)au> wrote:
>>
>> A relatively easy change would be to make any given autovacuum worker
>> on start up tolerate a stats file that is out of date by up to, say,
>> naptime/5. That would greatly reduce the amount of writing the stats
>> collector needs to do (assuming that few tables actually need
>> vacuuming during any given cycle), but wouldn't change the amount of
>> reading a worker needs to do because it still needs to read the file
>> each time as it doesn't inherit the stats from anyone. I don't think
>> it would be a problem that a table which becomes eligible for
>> vacuuming in the last 20% of a cycle would have to wait for one more
>> round. Especially as this change might motivate one to reduce the
>> naptime since doing so will be cheaper.
>
> If the stats are mirrored in memory, then that makes sense. Of course, if
> that's the case then couldn't we just alter the stats to flush at maximum
> once per N seconds / minutes?

The actual flushing is not under our control, but under the kernel's
control. But in essence that is what I am suggesting. If the vacuum
workers ask the stats collector for fresh stats less often, the stats
collector will write them to the kernel less often. If we are writing
them to the kernel less often, the kernel will flush them less often.
The kernel could choose to flush them less often anyway, but for some
reason with ext4 it doesn't.
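
As a rough illustration of the effect, here is a toy model in Python. It
is only a sketch, not PostgreSQL's actual code: the function name, the
60 second naptime, and the "one worker every 2 seconds" workload are all
made-up assumptions. The model rewrites the stats file only when a worker
asks for stats and the last written copy is older than the worker is
willing to tolerate.

```python
def count_stats_writes(request_times, max_staleness):
    """Toy model (hypothetical, not the real collector): rewrite the stats
    file only when a request arrives and the last written copy is older
    than max_staleness seconds."""
    writes = 0
    last_write = None  # time of the most recent write, None if never written
    for now in sorted(request_times):
        if last_write is None or now - last_write > max_staleness:
            writes += 1
            last_write = now
    return writes


# Assumed workload: naptime of 60 s, one worker start every 2 s for 10 minutes.
naptime = 60.0
requests = [2.0 * i for i in range(300)]

strict = count_stats_writes(requests, max_staleness=0.0)           # always demand fresh stats
relaxed = count_stats_writes(requests, max_staleness=naptime / 5)  # tolerate naptime/5
print(strict, relaxed)
```

Under these assumptions every request forces a write in the strict case,
while the relaxed case writes at most about once per naptime/5. The
number of reads is the same in both cases, since each worker still has to
read the file itself.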

> If the stats are not mirrored in memory,
> doesn't that imply that most of the databases will never flush updated stats
> to disk and so the file will become stale?

We don't do anything specific to cause them to be mirrored; that is just
how the kernel deals with frequently accessed file-system data. If the
kernel decided to evict the data from
memory, it would have to make sure it reached disk first. It is the
kernel's job to present a consistent image of all the file-system
data, regardless of whether it is actually in memory, or on disk, or
both. If the data on disk is stale, the kernel guarantees requests to
read it will be served from memory rather than from disk.
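
A small Python sketch of that guarantee, assuming a POSIX-like system
(the function name and the file name are hypothetical, chosen only for
this example): one descriptor writes without ever calling fsync, and a
second descriptor opened afterwards still reads the new contents, even
though nothing has forced them to disk.

```python
import os
import tempfile


def read_back_without_fsync(payload: bytes) -> bytes:
    """Write payload through one descriptor with no fsync, then read it
    back through a second, independently opened descriptor."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "stats.tmp")  # hypothetical file name
        writer = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            os.write(writer, payload)  # handed to the kernel; no fsync issued
            reader = os.open(path, os.O_RDONLY)
            try:
                # The kernel serves this read from its cached copy if the
                # data has not reached disk yet.
                return os.read(reader, len(payload))
            finally:
                os.close(reader)
        finally:
            os.close(writer)


print(read_back_without_fsync(b"fresh stats"))
```

The reader cannot tell whether the bytes came from memory or from disk,
which is the point: staleness of the on-disk copy is invisible to
processes going through the file system.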

Cheers,

Jeff
