Re: Inverted-list databases (was: Working on huge RAM based datasets)

From: "Mischa Sandberg" <mischa_sandberg(at)telus(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Inverted-list databases (was: Working on huge RAM based datasets)
Date: 2004-07-09 22:23:03
Message-ID: bVEHc.6984$iw3.4810@clgrps13
Lists: pgsql-performance

""Andy Ballingall"" <andy_ballingall(at)bigfoot(dot)com> wrote in message
news:011301c46597$15d145c0$0300a8c0(at)lappy(dot)(dot)(dot)

> On another thread, (not in this mailing list), someone mentioned that
> there are a class of databases which, rather than caching bits of database
> file (be it in the OS buffer cache or the postmaster workspace), construct a
> well indexed memory representation of the entire data in the postmaster
> workspace (or its equivalent), and this, remaining persistent, allows the DB
> to service backend queries far quicker than if the postmaster was working
> with the assumption that most of the data was on disk (even if, in practice,
> large amounts or perhaps even all of it resides in OS cache).

As a historical note, System R (granddaddy of all relational dbs) worked this
way, and it did so under memory constraints that are ridiculous by modern
standards.

Space-conscious MOLAP databases do this, FWIW.

Sybase 11 bitmap indexes pretty much amount to this, too.

I've built a SQL engine that used bitmap indexes within B-Tree indexes,
making it practical to index every field of every table (the purpose of the
engine).
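
To make the bitmap idea concrete, here is a minimal sketch, not the engine
described above. It assumes rows are plain dicts, uses a Python dict where that
engine used a B-Tree, and keeps one integer bitmap per (column, value) pair.
The names BitmapIndex and lookup are made up for illustration.

```python
from collections import defaultdict


class BitmapIndex:
    """Toy bitmap index: one int per (column, value); bit i is set if row i matches."""

    def __init__(self, rows):
        self.nrows = len(rows)
        # Assumption: a dict stands in for the B-Tree that would hold the keys.
        self.bitmaps = defaultdict(int)
        for rowid, row in enumerate(rows):
            for column, value in row.items():
                self.bitmaps[(column, value)] |= 1 << rowid

    def lookup(self, **conditions):
        """Return the row ids that satisfy every column=value condition."""
        result = (1 << self.nrows) - 1  # start with all rows selected
        for column, value in conditions.items():
            # AND-ing bitmaps intersects the matching row sets.
            result &= self.bitmaps.get((column, value), 0)
        return [i for i in range(self.nrows) if (result >> i) & 1]


rows = [
    {"city": "Oslo", "status": "active"},
    {"city": "Rome", "status": "active"},
    {"city": "Oslo", "status": "idle"},
    {"city": "Oslo", "status": "active"},
]
index = BitmapIndex(rows)
print(index.lookup(city="Oslo", status="active"))  # [0, 3]
```

Because every field is indexed the same way, a multi-column predicate is just a
few bitwise ANDs, whichever columns it names.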

You can also build special-purpose in-memory representations to test for
existence (of a key), when you expect a lot of failures. Google
"superimposed coding" e.g. http://www.dbcsoftware.com/dbcnews/NOV94.TXT
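
A rough sketch of such an existence test, in the style of a Bloom filter (the
best-known form of superimposed coding). Each key is hashed to a few bit
positions, and all keys are OR-ed ("superimposed") into one bit vector. The
class name, the sizes, and the use of SHA-256 are assumptions for illustration
and are not taken from the article linked above.

```python
import hashlib


class SuperimposedFilter:
    """In-memory existence test: no false negatives, occasional false positives."""

    def __init__(self, nbits=1024, nhashes=3):
        self.nbits = nbits      # width of the shared bit vector (assumed size)
        self.nhashes = nhashes  # bit positions set per key (assumed count)
        self.bits = 0

    def _codeword(self, key):
        """Build the key's codeword: nhashes pseudo-random bit positions."""
        word = 0
        for i in range(self.nhashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            word |= 1 << (int.from_bytes(digest[:8], "big") % self.nbits)
        return word

    def add(self, key):
        # Superimpose (OR) the key's codeword onto the shared vector.
        self.bits |= self._codeword(key)

    def might_contain(self, key):
        # If any codeword bit is missing, the key was definitely never added.
        word = self._codeword(key)
        return (self.bits & word) == word


f = SuperimposedFilter()
for key in ("alice", "bob", "carol"):
    f.add(key)
print(f.might_contain("bob"))  # True
```

A False answer is definitive, so most failed lookups never touch the real
index; a True answer still has to be confirmed against the actual data.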
