Re: Caching by Postgres

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: Donald Courtney <Donald(dot)Courtney(at)sun(dot)com>, Frank Wiles <frank(at)wiles(dot)org>, gokulnathbabu manoharan <gokulnathbabu(at)yahoo(dot)com>
Subject: Re: Caching by Postgres
Date: 2005-08-23 19:38:04
Message-ID: 200508231238.04844.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Donald,

> This whole issue of data caching is a troubling issue with postreSQL
> in that even if you ran postgreSQL on a 64 bit address space
> with larger number of CPUs you won't see much of a scale up
> and possibly even a drop.

Since when? Barring the context switch bug, you're not going to get a
drop with more processors/more RAM.

You may fail to get any gain, though. If your database is only 100MB in
size, having 11G of cache space isn't going to help you much over having
only 1G.

> I am not alone in having the *expectation*
> that a database should have some cache size parameter and
> the option to skip the file system.

Sure, because that's the conventional wisdom, as writ by Oracle. However,
this comes with substantial code maintenance costs and portability
limitations which have to be measured against any gain in performance.

> If I use oracle, sybase, mysql
> and maxdb they all have the ability to size a data cache and move
> to 64 bits.

And yet, we regularly outperform Sybase and MySQL on heavy OLTP loads on
commodity x86 hardware. So apparently DB caching isn't everything. ;-)

I'm not saying that it's not worth testing larger database caches -- even
taking over most of RAM -- on high-speed systems. In fact, I'm working
on doing that kind of test now. However, barring test results, we can't
assume that taking over RAM and the FS cache would have a substantial
performance benefit; that remains to be shown.

The other thing is that we've had, and continue to have, low-hanging fruit
which have a clear and measurable effect on performance and are fixable
without bloating the PG code. Some of these issues (COPY path, context
switching, locks, GiST concurrency, some aggregates) have been addressed
in the 8.1 code; some remain to be addressed (sorts, disk spill, 64-bit
sort mem, other aggregates, index-only access, etc.). Why tackle a huge,
250-hour project which could fail when a 20-hour patch is more likely to
provide the same performance benefit?

We have the same discussion (annually) about mmap. Using mmap *might*
provide us with a huge performance boost. However, it would *definitely*
require 300hours (or more) of programmer time to test properly, and might
not benefit us at all.

Of course, if *you* want to work on large database cache improvements, be
my guest ... it's an open source project! Submit your patches! I'll be
happy to test them.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message mark 2005-08-23 20:03:42 Re: Caching by Postgres
Previous Message Tom Lane 2005-08-23 19:23:43 Re: Caching by Postgres