Postgresql Caching

From: Anon Mous <soundami(at)yahoo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Postgresql Caching
Date: 2006-10-15 10:41:25
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


I may have a workable idea on a way to add caching to Postgres without

disturbing the MVCC functionality.

Caching, as I've been reading can provide an amazing and sometimes

almost unbelievable performance boost to a database based application,

especially for data that is rarely modified.

The problem, as Tom Lane suggested, and the reason caching hasn't become

more popular is simply because it's very complex to keep up to date

especially in an MVCC environment. But, most every database has some

tables (and usually these are some of the most popular) that would

benefit greatly from caching. For example security permissions or product

descriptions and prices in an e-commerce application. Other tables,

however, like orders and accounting, are very dynamic and cannot afford

to ever be stale.

However, the problem is surmountable and has been figured out by Oracle,

although I don't know how they did it:

What if there was a way to automatically setup caching so data would

only ever be cached when it's not being changed?

I think that may be possible with an extendion the Postgresql api that

tracks which tables are involved in a transaction. Queries could be

cached until a transaction starts to modify a table. At that point, the

api passes all queries that reference that table to the database for


In reality, even these tables that are very active may see great speed

improvements from caching because most of the time transactions don't

make any modifications until they are near the end of the transaction.


Would it be possible to combine a special memcache implementation of

memcache with a Postgresql interface wrapper?

Additional Possibilities

If the caching was implemented in an api, perhaps one could also add
connection caching.

When Stale Data is OK

During a query request, let the application specifically say if they

would be OK with stale data no older than ___ seconds...

Yes, this would be a change to the api, but for those applications that

can withstand receiving stale data, it could provide a significant

performance boost on very active tables. I ran across a report recently that suggested for some
applications/tables, a 10 second delay can reduce database hits by over
50 percent.

Automatic Tuning

Let the caching system automatically track how often a particular table

benefits from caching. If it doesn't, or its a critical table as defined by the database administrator always bypass caching on that table.

Write Caching

On some tables, like web sessions, it may be worthwhile to implement a

batching function where updates are written to disk (to be acid

compliant) and can later be more efficiently processed by the database

in a batch. Of course, the api would have to detect any queries touching

that table, and run the batch first, but it seems that some performance

gains are available since most of the requests will be for a single row,

and that would be available in the cache.




Browse pgsql-hackers by date

  From Date Subject
Next Message mark 2006-10-15 14:40:13 Re: Postgresql Caching
Previous Message Markus Schaber 2006-10-15 09:42:45 Re: SQL functions, INSERT/UPDATE/DELETE RETURNING, and