Re: Potential memory usage issue

From: Bill Moran <wmoran(at)collaborativefusion(dot)com>
To: David Brain <dbrain(at)bandwidth(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Potential memory usage issue
Date: 2007-03-22 17:03:35
Message-ID: 20070322130335.e142faed.wmoran@collaborativefusion.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

In response to David Brain <dbrain(at)bandwidth(dot)com>:

> Bill Moran wrote:
>
> >
> > Install the pg_bufferstats contrib module and take a look at how shared
> > memory is being use. I like to use MRTG to graph shared buffer usage
> > over time, but you can just do a SELECT count(*) WHERE NOT NULL to see
> > how many buffers are actually in use.
> >
>
> Can you explain what you'd use as a diagnostic on this - I just
> installed the module - but I'm not entirely clear as to what the output
> is actually showing me and/or what would be considered good or bad.

Well, there are different things you can do with it. See the README, which
I found pretty comprehensive.

What I was referring to was the ability to track how many shared_buffers
were actually in use, which can easily be seen at a cluster-wide view
with two queries:
select count(*) from pg_buffercache;
select count(*) from pg_buffercache where reldatabase is not null;

The first gives you the total number of buffers available (you could get
this from your postgresql.conf as well, but with automated collection and
graphing via mrtg, doing it this way guarantees that we'll always know
what the _real_ value is) The second gives you the number of buffers
that are actually holding data.

If #2 is smaller than #1, that indicates that the entire working set of
your database is able to fit in shared memory. This might not be your
entire database, as some tables might never be queried from (i.e. log
tables that are only queried when stuff goes wrong ...) This means
that Postgres is usually able to execute queries without going to the
disk for data, which usually equates to fast queries. If it's
consistently _much_ lower, it may indicate that your shared_buffers
value is too high, and the system may benefit from re-balancing memory
usage.

If #2 is equal to #1, it probably means that your working set is larger
than the available shared buffers, this _may_ mean that your queries are
using the disk a lot, and that you _may_ benefit from increasing
shared_buffers, adding more RAM, sacrificing a 15000 RPM SCSI drive to
the gods of performance, etc ...

Another great thing to track is read activity. I do this via the
pg_stat_database table:
select sum(blks_hit) from pg_stat_database;
select sum(blks_read) from pg_stat_database;

(Note that you need block-level stats collecting enabled to make these
usable)

If the second one is increasing particularly fast, that's a strong
indication that more shared_memory might improve performance. If
neither of them are increasing, that indicates that nobody's really
doing much with the database ;)

I strongly recommend that you graph these values using mrtg or cacti
or one of the many other programs designed to do that. It makes life
nice when someone says, "hey, the DB system was really slow yesterday
while you where busy in meetings, can you speed it up."

--
Bill Moran
Collaborative Fusion Inc.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Michael Stone 2007-03-22 17:10:08 Re: Parallel Vacuum
Previous Message Craig A. James 2007-03-22 17:02:00 Re: Performance of count(*)