Re: memory question

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: "Campbell, Lance" <lance(at)illinois(dot)edu>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: memory question
Date: 2010-03-25 02:06:10
Message-ID: dcc563d11003241906y49fe3d37xbeedf4c066aa242d@mail.gmail.com
Lists: pgsql-performance

On Wed, Mar 24, 2010 at 6:49 PM, Campbell, Lance <lance(at)illinois(dot)edu> wrote:
> PostgreSQL 8.4.3
>
> Linux Redhat 5.0
>
> Question: How much memory do I really need?

The answer is "as much as needed to hold your entire database in
memory and a few gig left over for sorts and backends to play in."

> From my understanding there are two primary strategies for setting up
> PostgreSQL in relationship to memory:
>
>
>
> 1)      Rely on Linux to cache the files.  In this approach you set the
> shared_buffers to a relatively low number.
>
> 2)      You can set shared_buffers to a very large percentage of your memory
> so that PostgreSQL reserves the memory for the database.

The kernel is better at caching large amounts of memory. Pg is better
at handling somewhat smaller amounts, and at not flushing out
random-access data in favor of sequential-access data.

> I am currently using option #1.  I have 24 Gig of memory on my server and
> the database takes up 17 Gig of disk space.  When I do the Linux command
> “top” I notice that 19 Gig is allocated for cache.  Is there a way for me to
> tell how much of that cache is associated with the caching of database
> files?

Probably nearly all of that 19G of cache is allocated to pg files.
I'm not sure how to tell off the top of my head, though.
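
You can at least see what PostgreSQL itself is holding in
shared_buffers (not the OS page cache) with the pg_buffercache contrib
module, which ships with 8.4. A sketch, assuming the module has been
installed from contrib and the default 8K block size:

```sql
-- Which relations occupy the most shared_buffers, assuming the
-- default 8192-byte block size. reldatabase 0 covers shared catalogs.
SELECT c.relname,
       pg_size_pretty(count(*) * 8192) AS buffered
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = c.relfilenode
WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                            WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY count(*) DESC
LIMIT 10;
```

For the OS page cache side, third-party tools such as fincore can
report per-file cache residency, but there's nothing built into a
stock Linux install of that era that breaks it down by file.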

> I am basically asking how much memory do I really need?  Maybe I have
> complete overkill.  Maybe I am getting to a point where I might need more
> memory.

Actually, there are three levels of caching that are possible. 1: The
entire db, tables and indexes, fits in RAM. This is the fastest
setup. Worth the extra $ for RAM if you can afford it / the db isn't
too huge. 2: The indexes fit in RAM, and some of the tables do. Still
pretty fast. Definitely worth paying a little extra for. 3: Neither
indexes nor tables fit wholly in RAM. At this point the speed of your
large disk array becomes important, and you want a fast caching RAID
controller. Both of these items (disk array and RAID controller) are
considerably more costly than 16 or 32 Gigs of RAM.
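
To see which of those three tiers you're in, compare your RAM against
the on-disk sizes. A sketch using the built-in size functions (the
cast is needed because 8.4's pg_size_pretty() takes bigint):

```sql
-- Total size of the current database on disk.
SELECT pg_size_pretty(pg_database_size(current_database()));

-- Total table size vs. total index size, to check whether at
-- least the indexes fit in RAM (tier 2 above).
SELECT pg_size_pretty(sum(pg_relation_size(oid))::bigint) AS tables
FROM pg_class WHERE relkind = 'r';

SELECT pg_size_pretty(sum(pg_relation_size(oid))::bigint) AS indexes
FROM pg_class WHERE relkind = 'i';
```

If indexes plus the hot tables come in well under your 24G, you're
effectively in tier 1 already.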

> My thought was I could use option #2 and then set the number to a lower
> amount.  If the performance is bad then slowly work the number up.

I'm not sure what you mean. Install less RAM and let PG do all the
caching? Usually a bad idea. Usually. I'm sure there are use cases
where it might be a good idea. But keep in mind, a large
shared_buffers doesn't JUST buffer your reads, it also gives the
server a much larger memory space to keep track of in terms of things
that need to get written out, etc. I'm actually about to reduce
shared_buffers on one reporting server from 8G down to 1 or 2G,
because that's plenty, and it's having a hard time keeping up with the
huge checkpoints it's having to do.
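
For a 24G box in that spirit, something like the following
postgresql.conf fragment is one way to lean on the OS cache. These
numbers are purely illustrative, not a recommendation; tune them
against your own checkpoint and query behavior:

```
# Illustrative 8.4-era settings for a 24G server.
shared_buffers = 2GB               # modest; let the kernel cache the rest
effective_cache_size = 20GB        # planner hint: expected OS cache size
checkpoint_segments = 32           # spread checkpoint I/O over more WAL
checkpoint_completion_target = 0.7 # smooth checkpoint writes over time
```

Note effective_cache_size doesn't allocate anything; it just tells the
planner how much caching to assume.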

> Our server manager seems to think that I have way too much memory.  He thinks
> that we only need 5 Gig.

How much do you absolutely need to boot up, run postgresql, and not
run out of memory? That's what you "need", and it's probably around
1Gig. But that's no less arbitrary than 5G. Did he show you how he
arrived at that number? If your DB is 17Gig on disk, it's foolish to
be cheap on memory.

> I don’t really believe that.  But I want to cover
> myself.  With money tight I don’t want to be the person who is wasting
> resources.  We need to replace our database servers so I want to do the
> right thing.

You can waste your own time (valuable, but a sunk cost), other
people's time (more valuable, also a sunk cost), or "waste" a few
dollars on memory. 24Gig isn't really that expensive compared to, say,
10 seconds per transaction for 100 users, 1000 times a day. That's 11
user-days lost in a single day. Tens of seconds start to add up.
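
The arithmetic behind that, worked out in SQL for anyone who wants to
plug in their own numbers:

```sql
-- 10 s/transaction x 100 users x 1000 transactions/day
SELECT 10 * 100 * 1000 AS wasted_seconds;            -- 1,000,000 s

-- Converted to 24-hour "user days" (86400 s each):
SELECT round(10 * 100 * 1000 / 86400.0, 1) AS days;  -- about 11.6 days
```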
