Re: DB cache size strategies

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: "Ed L(dot)" <pgsql(at)bluepolka(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: DB cache size strategies
Date: 2004-01-31 01:06:47
Message-ID: 20040131010647.GB31663@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, Jan 30, 2004 at 03:19:56PM -0700, Ed L. wrote:
> On Friday January 30 2004 2:33, Ed L. wrote:
> >
> > One key aspect of pgsql performance tuning is to adjust the memory
> > consumption settings (shared_buffers, sort_mem, etc) large enough to hold
> > as much of the database in shared memory as possible while not causing
> > page swap-ins. I understand that both page swap-outs and swap space
> > usage is normal and OK, but lots of page swap-ins are bad). In other
> > words, for absolute fastest performance, we want a database buffer cache
> > hit rate of as close to 100% as possible.
>
> I'm also curious about the relationship of DB shared buffer cache to the
> linux/hpux kernel caches. In particular, if the block being retrieved in
> pgsql was in the kernel's cache but not in the DB cache, thereby forcing a
> read() system call, what kind of quantitative difference in performance
> would one expect when comparing with block retrievals coming from the
> cache? I would think they'd differ only by something on the order of
> microseconds. Is the linux kernel disk cache normally a duplicate of much
> of what is in the DB cache? For linux, does the kernel cache use only
> "available" memory until a program needs it, while the pgsql DB cache
> memory is guaranteed at startup?

Whenever the database needs a block not in memory it get loaded and ends up
in both the OS cache and the DB cache. The difference between getting a
block out of DB cache and OS cache is very, very small compared to loading
off disk.

Hence, the larger your DB cache, the less overall data you can cache due to
the duplication. OTOH, postgres can only directly use the DB cache and can
only access the OS cache indirectly.

What I try to aim for is the keep the DB cache somewhere near the "working
set", ie the amount of space used by all the system tables and commonly used
tables in the system. It's shared by all postgres servers in a cluster so
keep that in mind. Also the OS cache may be shared by other processes, so
your results will depend on what other things are happening.

Finally, at least on Linux, the shared memory postgres uses for cache can
also be swapped out making it very difficult to determine the correct value.

Hope this helps,

--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> (... have gone from d-i being barely usable even by its developers
> anywhere, to being about 20% done. Sweet. And the last 80% usually takes
> 20% of the time, too, right?) -- Anthony Towns, debian-devel-announce

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kris Jurka 2004-01-31 01:17:44 Re: Triggers, Stored Procedures, PHP. was: Re: PostgreSQL
Previous Message ljb 2004-01-31 00:58:53 Re: Problem with API (libpq) - detailed error codes