I may have a workable idea for a way to add caching to Postgres
without disturbing its MVCC functionality.
From what I've been reading, caching can provide an amazing, sometimes
almost unbelievable, performance boost to a database-backed
application, especially for data that is rarely modified.
The problem, as Tom Lane suggested, and the reason caching hasn't
become more popular, is simply that a cache is very complex to keep up
to date, especially in an MVCC environment. But almost every database
has some tables (usually among the most heavily used) that would
benefit greatly from caching: for example, security permissions, or
product descriptions and prices in an e-commerce application. Other
tables, however, like orders and accounting, are very dynamic and
cannot afford to ever be stale.
However, the problem is surmountable, and Oracle has figured it out,
although I don't know how they did it:
What if there were a way to automatically set up caching so data would
only ever be cached while it is not being changed?
I think that may be possible with an extension to the PostgreSQL API
that tracks which tables are involved in a transaction. Queries could
be served from the cache until a transaction starts to modify a table.
At that point, the API passes all queries that reference that table
straight through to the database for direct execution.
In reality, even very active tables may see great speed improvements
from caching, because most transactions don't make any modifications
until they are near their end.
Would it be possible to combine a specialized memcached implementation
with a PostgreSQL interface wrapper?
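To make the idea concrete, here is a minimal sketch of such a wrapper
in Python. It only illustrates the bookkeeping: a plain dict stands in
for memcached, sqlite3 stands in for PostgreSQL, and the caller
supplies the tables each query touches, since SQL parsing is a
separate problem. None of these class or method names are an existing
API.

    import sqlite3

    class CachingWrapper:
        """Hypothetical wrapper between the application and the
        database. Everything here is an assumption, not a real API."""

        def __init__(self, db):
            self.db = db
            self.cache = {}    # SQL text -> (rows, tables referenced)
            self.dirty = set() # tables modified by the open transaction

        def select(self, sql, tables):
            tables = set(tables)
            # Serve from cache only while no open transaction has
            # touched the tables this query references.
            if not (self.dirty & tables) and sql in self.cache:
                return self.cache[sql][0]
            rows = self.db.execute(sql).fetchall()
            if not (self.dirty & tables):
                self.cache[sql] = (rows, tables)
            return rows

        def modify(self, sql, tables):
            # A write marks its tables dirty: later reads bypass cache.
            self.dirty |= set(tables)
            self.db.execute(sql)

        def commit(self):
            self.db.commit()
            # Drop every cached result that references a dirty table.
            for key in [k for k, (_, t) in self.cache.items()
                        if t & self.dirty]:
                del self.cache[key]
            self.dirty.clear()

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE products (id INTEGER, price REAL)")
    wrapper = CachingWrapper(db)
    wrapper.modify("INSERT INTO products VALUES (1, 9.99)", ["products"])
    wrapper.commit()
    print(wrapper.select("SELECT * FROM products", ["products"]))  # miss
    print(wrapper.select("SELECT * FROM products", ["products"]))  # hit

The dirty-table set is the key point: the moment a transaction writes
to a table, every query referencing that table goes to the database,
so a cached answer never masks a transaction's own changes.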
If the caching were implemented in an API, perhaps one could also add
a feature like the following:
When Stale Data is OK
During a query request, let the application say explicitly whether it
would be OK with stale data no older than ___ seconds...
Yes, this would be a change to the API, but for those applications
that can withstand receiving stale data, it could provide a
significant performance boost on very active tables. I ran across a
report recently suggesting that, for some applications/tables, a 10
second delay can reduce database hits by over ___.
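Here is a sketch of what that API change might look like, again in
Python with invented names; `fetch` is whatever function actually runs
the query against the database:

    import time

    class StaleOKCache:
        """Read API where the caller states its staleness budget.
        Illustrative names only, not an existing interface."""

        def __init__(self, fetch):
            self.fetch = fetch
            self.cache = {}        # SQL text -> (rows, time cached)

        def select(self, sql, max_age=0.0):
            hit = self.cache.get(sql)
            if hit is not None and time.time() - hit[1] <= max_age:
                return hit[0]      # stale, but within the budget
            rows = self.fetch(sql) # otherwise go to the database
            self.cache[sql] = (rows, time.time())
            return rows

    cache = StaleOKCache(lambda sql: [("widget", 9.99)])  # stub runner
    fresh = cache.select("SELECT * FROM products")        # hits the db
    maybe_stale = cache.select("SELECT * FROM products", max_age=10)

With the default max_age of zero the behavior is unchanged; only
callers that opt in ever see a stale row.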
Let the caching system automatically track how much a particular table
benefits from caching. If it doesn't benefit, or if it's a critical
table as defined by the database administrator, always bypass caching
on that table.
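That tracking could be as simple as per-table hit/miss counters, along
these lines (all names and thresholds here are made up for the sake of
illustration):

    class TableCachePolicy:
        """Per-table cache accounting. The DBA lists critical tables
        up front; for the rest, the observed hit rate decides."""

        def __init__(self, critical=(), min_hit_rate=0.2,
                     min_samples=100):
            self.critical = set(critical)  # always bypass these
            self.hits = {}
            self.misses = {}
            self.min_hit_rate = min_hit_rate
            self.min_samples = min_samples

        def record(self, table, hit):
            counts = self.hits if hit else self.misses
            counts[table] = counts.get(table, 0) + 1

        def should_cache(self, table):
            if table in self.critical:
                return False               # DBA said never cache it
            h = self.hits.get(table, 0)
            m = self.misses.get(table, 0)
            if h + m < self.min_samples:
                return True                # too little data; keep measuring
            return h / (h + m) >= self.min_hit_rate

    policy = TableCachePolicy(critical={"accounting"})
    policy.record("products", hit=True)
    assert policy.should_cache("products")
    assert not policy.should_cache("accounting")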
On some tables, like web sessions, it may be worthwhile to implement a
batching function where updates are written to disk (to stay ACID
compliant) and later processed more efficiently by the database in a
batch. Of course, the API would have to detect any queries touching
that table and run the batch first, but it seems some performance
gains are available, since most requests will be for a single row, and
that row would be available in the cache.
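A sketch of that batching idea, under the same caveats as before:
updates are appended to a log file and fsync'd so they survive a crash
(the ACID point above), mirrored into an in-memory cache, and only
written to the database in bulk. sqlite3 and every name here are
stand-ins; PostgreSQL would need its own upsert strategy.

    import json
    import os
    import sqlite3

    class SessionBatcher:
        """Hypothetical write batcher for a sessions-style table."""

        def __init__(self, db, log_path="sessions.log"):
            self.db = db
            self.log = open(log_path, "a")
            self.pending = []  # (session_id, payload) not yet in the db
            self.cache = {}    # session_id -> latest payload

        def update(self, session_id, payload):
            self.log.write(json.dumps([session_id, payload]) + "\n")
            self.log.flush()
            os.fsync(self.log.fileno())  # durable before we acknowledge
            self.cache[session_id] = payload
            self.pending.append((session_id, payload))

        def get(self, session_id):
            # Single-row requests are served straight from the cache.
            if session_id in self.cache:
                return self.cache[session_id]
            row = self.db.execute(
                "SELECT payload FROM sessions WHERE id = ?",
                (session_id,)).fetchone()
            return row[0] if row else None

        def query(self, sql, params=()):
            self.flush()                 # the batch must land first
            return self.db.execute(sql, params).fetchall()

        def flush(self):
            if self.pending:
                self.db.executemany(
                    "INSERT OR REPLACE INTO sessions (id, payload)"
                    " VALUES (?, ?)", self.pending)
                self.db.commit()
                self.pending.clear()

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, payload TEXT)")
    batcher = SessionBatcher(db)
    batcher.update("s1", "cart=3 items")
    print(batcher.get("s1"))                               # from cache
    print(batcher.query("SELECT COUNT(*) FROM sessions"))  # forces flush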