Re: PostgreSQL as a local in-memory cache

From: "jgardner(at)jonathangardner(dot)net" <jgardner(at)jonathangardner(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: PostgreSQL as a local in-memory cache
Date: 2010-06-17 02:33:13
Message-ID: 18287446-8677-46d6-a581-c4777f60ebda@40g2000pry.googlegroups.com
Lists: pgsql-performance

On Jun 14, 7:14 pm, "jgard(dot)(dot)(dot)(at)jonathangardner(dot)net"
<jgard(dot)(dot)(dot)(at)jonathangardner(dot)net> wrote:
> We have a fairly unique need for a local, in-memory cache. This will
> store data aggregated from other sources. Generating the data only
> takes a few minutes, and it is updated often. There will be some
> fairly expensive queries of arbitrary complexity run at a fairly high
> rate. We're looking for high concurrency and reasonable performance
> throughout.
>
> The entire data set is roughly 20 MB in size. We've tried Carbonado in
> front of SleepycatJE only to discover that it chokes at a fairly low
> concurrency and that Carbonado's rule-based optimizer is wholly
> insufficient for our needs. We've also tried Carbonado's Map
> Repository which suffers the same problems.
>
> I've since moved the backend database to a local PostgreSQL instance
> hoping to take advantage of PostgreSQL's superior performance at high
> concurrency. Of course, at the default settings, it performs quite
> poorly compares to the Map Repository and Sleepycat JE.
>
> My question is how can I configure the database to run as quickly as
> possible if I don't care about data consistency or durability? That
> is, the data is updated so often and it can be reproduced fairly
> rapidly so that if there is a server crash or random particles from
> space mess up memory we'd just restart the machine and move on.
>
> I've never configured PostgreSQL to work like this and I thought maybe
> someone here had some ideas on a good approach to this.

Just to summarize what I've been able to accomplish so far. By turning
fsync and synchronous_commit off, and moving the data dir to tmpfs,
I've been able to run the expensive queries much faster than BDB or
the MapRepository that comes with Carbonado. This is because
PostgreSQL's planner is so much faster and better than whatever
Carbonado has. Tweaking indexes has only made things run faster.

Right now I'm wrapping up the project so that we can do some serious
performance benchmarks. I'll let you all know how it goes.

Also, just a note that setting up PostgreSQL for these weird scenarios
turned out to be just a tiny bit harder than setting up SQLite. I
remember several years ago when there was a push to simplify the
configuration and installation of PostgreSQL, and I believe that effort
has borne fruit.
