So, I've been discussing this because using PostgreSQL on the caching
layer has become more common than I think most people realize. Jonathan's
is one of 4 companies I know of that are doing this, and with the growth
of Hadoop and other large-scale data-processing technologies, I think
demand will increase.
Especially as, in repeated tests, PostgreSQL with persistence turned off
is just as fast as the fastest nondurable NoSQL database. And it has a
LOT more features.
Now, while fsync=off and tmpfs for WAL more-or-less eliminate the IO for
durability, they don't eliminate the CPU time. Which means that a
caching version of PostgreSQL could be even faster. To do that, we'd need to:
a) Eliminate WAL logging entirely
b) Eliminate checkpointing
c) Turn off the background writer
d) Have PostgreSQL refuse to restart after a crash and instead call an
external script (for reprovisioning)
Of the four above, (a) is the most difficult codewise. (b), (c) and (d)
should be relatively straightforward, although I believe that we now
have the bgwriter doing some other essential work besides syncing
buffers. There's also a narrower use-case in eliminating (a), since a
non-fsync'd server which was recording WAL could be used as part of a
This isn't on hackers because I'm not ready to start working on a patch,
but I'd like some feedback on the complexities of doing (b) and (c) as
well as how many people could use a non-persistent, in-memory postgres.
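
To make (d) a bit more concrete, here is a rough sketch of the behaviour
done from outside the server, as a wrapper script, rather than as the
in-core change proposed above. Everything named here is hypothetical: the
function run_without_recovery, the data directory and the reprovisioning
script path are placeholders, not anything PostgreSQL ships. It also only
sees the exit of the top-level server process, so it does not cover
crashes that the server recovers from internally.

```python
import subprocess


def run_without_recovery(server_cmd, reprovision_cmd):
    """Run server_cmd once; on abnormal exit, call reprovision_cmd
    instead of restarting the server.

    Returns "clean" if the server exited with status 0,
    otherwise "reprovisioned".
    """
    exit_code = subprocess.call(server_cmd)
    if exit_code == 0:
        # Clean shutdown: nothing to rebuild.
        return "clean"
    # Crash or error exit: do not restart on the old data directory,
    # hand off to the external script and pass it the exit status.
    subprocess.call(reprovision_cmd + [str(exit_code)])
    return "reprovisioned"


# Example invocation (hypothetical paths, adjust to your setup):
# run_without_recovery(
#     ["postgres", "-D", "/mnt/ramdisk/pgdata"],
#     ["/usr/local/bin/reprovision-cache.sh"],
# )
```

The point of the sketch is only that the cache is treated as disposable:
after anything other than a clean shutdown, the old data directory is
never reused and the node is rebuilt by the external script.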
-- Josh Berkus
PostgreSQL Experts Inc.