Re: Working on huge RAM based datasets

From: Christopher Browne <cbbrowne(at)acm(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Working on huge RAM based datasets
Date: 2004-07-10 13:06:34
Message-ID: m3hdsgovtx.fsf@wolfe.cbbrowne.com
Lists: pgsql-performance

Quoth andy_ballingall(at)bigfoot(dot)com ("Andy Ballingall"):
> This is the future, isn't it? Each year, a higher percentage of DB
> applications will be able to fit entirely in RAM, and that percentage is
> going to be quite significant in just a few years. The disk system gets
> relegated to a data preload on startup and servicing the writes as the
> server does its stuff.

Regrettably, this may fit better with MySQL, as its architecture is
already oriented toward plugging in different "storage engines"
underneath.

There may be merit to the notion of implementing in-memory databases;
some assumptions change:

- You might use bitmap indices, although that probably "kills" MVCC;

- You might use T-trees rather than B-trees for indices, although
research seems to indicate that B-trees win out if there is a
great deal of concurrent access;

- Compression schemes that fit more records into memory can become
worthwhile, where they wouldn't be if you were relying on demand
paging.
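
To make the bitmap index idea concrete, here is a minimal sketch in
Python. It is a toy under stated assumptions, not how FastDB or any
real engine does it: the BitmapIndex class and its method names are
hypothetical, each distinct column value gets one integer used as a
bitset, and bit i is set when row i holds that value. Note that it
keeps exactly one bit per row position, with no notion of row
versions, which hints at why this sits awkwardly with MVCC.

```python
from collections import defaultdict


class BitmapIndex:
    """Toy bitmap index (hypothetical): one integer bitset per distinct value."""

    def __init__(self):
        self.bitmaps = defaultdict(int)  # value -> bitset of row numbers
        self.nrows = 0

    def append(self, value):
        """Record that the next row holds `value`; return its row number."""
        row = self.nrows
        self.bitmaps[value] |= 1 << row
        self.nrows += 1
        return row

    def eq(self, value):
        """Bitset of the rows whose column equals `value` (0 if none)."""
        return self.bitmaps.get(value, 0)

    def rows(self, bitmap):
        """Expand a bitset into a sorted list of row numbers."""
        return [i for i in range(self.nrows) if (bitmap >> i) & 1]


# Two indexed columns over the same four rows.
colour = BitmapIndex()
size = BitmapIndex()
for c, s in [("red", "S"), ("blue", "M"), ("red", "M"), ("red", "S")]:
    colour.append(c)
    size.append(s)

# colour = 'red' AND size = 'S' is a single bitwise AND of two bitsets.
matches = colour.rows(colour.eq("red") & size.eq("S"))
print(matches)  # [0, 3]
```

The attraction for in-memory work is that combining predicates is
plain bitwise arithmetic on data that is already in RAM.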

If you really want to try this, then look at Konstantin Knizhnik's
FastDB system:
http://www.ispras.ru/~knizhnik/fastdb.html

It assumes that your application will be a monolithic C++ process; if
that isn't the case, then performance will probably suffer due to the
added context switches.

The changes in assumptions are pretty vital ones that imply you're
heading in a fairly different direction from the one PostgreSQL
seems to be taking.

That's not to say that there isn't merit to building a database system
using T-trees and bitmap indices attuned to applications where
main-memory storage is key; it's just that the proposal probably
should go somewhere else.
--
output = ("cbbrowne" "@" "cbbrowne.com")
http://www3.sympatico.ca/cbbrowne/languages.html
How does the guy who drives the snowplow get to work in the mornings?
