Database Caching

From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Database Caching
Date: 2002-02-28 22:23:46
Message-ID: E16gYpD-0007KY-00@mclean.mail.mindspring.net
Lists: pgsql-hackers


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I had to make a relatively long drive yesterday, so I had lots of free time
to do some thinking...and my thoughts were turning to caching and databases.
The following is what I came up with: forgive me if it seems to be just an
obvious ramble...

Why does a database need caching?

Normally, when one thinks of a database (or, to be precise, an RDBMS), the
ACID acronym comes up. This is concerned with having a stable database that
can reliably be used by many users at the same time. Caching a query is
unintuitive because it involves sharing information from transactions that
may be separated by a great amount of time and/or by different users.
However, from the standpoint of the database server, caching increases
efficiency enormously. If 800 users all make the same query, then caching
can help the database server backend (hereafter simply "database") to
save part or all of the work it performs so it doesn't have to repeat the
entire sequence of steps 800 times.

What is caching?

Caching basically means that we want to store frequently-used information
in an easily accessible area. Usually, this means storing it in memory.
Caching has three main goals: reducing disk access, reducing computation
(i.e. CPU utilization), and reducing the time it takes a user to see a
result. It does all this at the expense of RAM, and the tradeoff is almost
always worth it.

In a database, there are three basic types of caching: query results,
query plans, and relations.

The first, query result caching, simply means that we store into memory
the exact output of a SELECT query for the next time that somebody performs
that exact same SELECT query. Thus, if 800 people do a "SELECT * FROM foo",
the database runs it for the first person, saves the results, and simply
reads the cache for the next 799 requests. This saves the database from doing
any disk access, practically removes CPU usage, and speeds up the query.

The second, query plan caching, involves saving the results of the optimizer,
which is responsible for figuring out exactly "how" the database is going to
fetch the requested data. This type of caching usually involves a "prepared"
query, which has almost all of the information needed to run the query with
the exception of one or more "placeholders" (spots that are populated with
variables at a later time). Non-prepared statements can be cached this way
as well. Thus, if someone prepares the query "SELECT flavor FROM
foo WHERE size=?", and then executes it by sending in 300 different values
for "size", the prepared statement is run through the optimizer, the
resulting path is stored into the query plan cache, and the stored path is
used for the 300 execute requests. Because the path is already known, the
optimizer does not need to be called, which saves the database CPU and time.

The third, relation caching, simply involves putting the entire relation
(usually a table or index) into memory so that it can be read quickly.
This saves disk access, which basically means that it saves time. (This type
of caching can also occur at the OS level, which caches files, but that will
not be discussed here.)

Those are the three basic types of caching; ways of implementing each are
discussed below. Each one should complement the others, and a query may be
able to use one, two, or all three of the caches.
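
To make the way the three caches fit together concrete, here is a rough
sketch in C of the lookup order a backend might follow. None of these
function or type names (result_cache_lookup, plan_cache_lookup, and so on)
exist anywhere; they are purely illustrative stand-ins.

    /* Purely illustrative: the three caches as a fall-through chain.
     * None of these functions or types exist in the current sources. */
    typedef struct QueryResult QueryResult;   /* opaque for this sketch */
    typedef struct QueryPlan   QueryPlan;

    extern QueryResult *result_cache_lookup(const char *query);
    extern QueryPlan   *plan_cache_lookup(const char *query);
    extern QueryPlan   *run_optimizer(const char *query);
    extern QueryResult *execute_plan(QueryPlan *plan);

    QueryResult *
    run_query(const char *query)
    {
        QueryResult *result;
        QueryPlan   *plan;

        /* 1. Result cache: an exact match means no optimizer, no
         *    executor, no disk access at all. */
        result = result_cache_lookup(query);
        if (result != NULL)
            return result;

        /* 2. Plan cache: we still have to execute, but we skip the
         *    optimizer entirely. */
        plan = plan_cache_lookup(query);
        if (plan == NULL)
            plan = run_optimizer(query);

        /* 3. Execute.  Relation caching (not shown) would speed up the
         *    scans inside execute_plan by keeping tables and indexes
         *    in RAM. */
        result = execute_plan(plan);
        return result;
    }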

I. Query result caching:

A query result cache is only used for SELECT queries that involve a
relation (i.e. not for "SELECT version"). Each cache entry has the following
fields: the query itself, the actual results, a status, an access time, an
access count, and a list of all included columns. (The column list actually
contains as much information as is needed to uniquely identify each column,
i.e. database, schema, table, and column.) The status is merely an indicator
of whether or not this cached query is valid. It may not be, because it may
be invalidated for a user within a transaction but still be of use to others.
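
In C, a single result-cache entry might look something like the sketch
below. The structure and field names are invented for illustration, not a
proposal for the actual representation.

    #include <stdbool.h>
    #include <stddef.h>
    #include <time.h>

    /* Invented name: one column reference, enough to uniquely identify
     * a column that a cached query depends on. */
    typedef struct ColumnRef
    {
        char   *dbname;       /* database */
        char   *schema;       /* schema   */
        char   *relname;      /* table    */
        char   *attname;      /* column   */
    } ColumnRef;

    /* One entry in the query result cache, mirroring the fields
     * described above. */
    typedef struct ResultCacheEntry
    {
        char       *query;          /* the (normalized) query text, the key  */
        char       *output;         /* the stored result set                 */
        bool        valid;          /* globally usable?                      */
        int         invalidated_by; /* backend whose open transaction flagged
                                     * this entry invalid; 0 if none (a sketch
                                     * simplification of the "status" field) */
        time_t      access_time;    /* last use                              */
        long        access_count;   /* number of uses                        */
        ColumnRef  *columns;        /* columns the query depends on          */
        int         ncolumns;
        size_t      total_size;     /* size of query + output, in bytes      */
    } ResultCacheEntry;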

When a select query is processed, it is first parsed apart into a basic common
form, stripping whitespace, standardizing case, etc., in order to facilitate
an accurate match. Note that no other pre-processing is really required,
since we are only interested in exact matches that produce the exact same
output. An advanced version of this would ideally be able to use the cached
output of "SELECT bar,baz FROM foo" when it receives the query "SELECT
baz,bar FROM foo", but that will require some advanced parsing. Possible,
but probably not something to attempt in the first iteration of a query
caching function. :) If there *is* a match (via a simple strcmp at first),
and the status is marked as "valid", then the database simply uses the
stored output, updates the access time and count, and exits. This should be
extremely fast, as no disk access is needed, and almost no CPU. The
complexity of the query will not matter either: a simple query will run just
as fast as something with 12 sorts and 28 joins.
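
A deliberately naive sketch of that normalize-then-strcmp step is below; it
only collapses whitespace and folds case outside of string literals, and
ignores comments, quoted identifiers, and everything else a real parser
would have to handle.

    #include <ctype.h>
    #include <stdio.h>
    #include <string.h>

    /* Collapse whitespace and lower-case everything outside single quotes,
     * just enough to reduce a query to a canonical form before the
     * strcmp-style match. */
    static void
    normalize_query(const char *src, char *dst, size_t dstlen)
    {
        size_t  i = 0;
        int     in_literal = 0;
        int     last_was_space = 0;

        for (; *src != '\0' && i + 1 < dstlen; src++)
        {
            char c = *src;

            if (c == '\'')
                in_literal = !in_literal;

            if (!in_literal && isspace((unsigned char) c))
            {
                if (last_was_space || i == 0)
                    continue;          /* collapse runs of whitespace */
                dst[i++] = ' ';
                last_was_space = 1;
                continue;
            }

            dst[i++] = in_literal ? c : (char) tolower((unsigned char) c);
            last_was_space = 0;
        }
        if (i > 0 && dst[i - 1] == ' ')  /* trim one trailing space */
            i--;
        dst[i] = '\0';
    }

    int
    main(void)
    {
        char a[256], b[256];

        normalize_query("SELECT  *   FROM foo", a, sizeof(a));
        normalize_query("select * from FOO",    b, sizeof(b));

        /* Both normalize to "select * from foo", so a simple strcmp
         * treats them as the same cache key. */
        printf("%s\n", strcmp(a, b) == 0 ? "cache hit" : "cache miss");
        return 0;
    }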

If a query is *not* already in the cache, then after the results are found
and delivered to the user, the database will try to store them for the
next appearance of that query. First, the free space in the cache will be
compared to the size of the query+output, to see if there is room for it. If
there is, the query will be saved, with a status of valid, a time of 'now', a
count of 1, a list of all affected columns found by parsing the query, and the
total size of the query+output. If there is no room, then one or more existing
entries will be deleted to make room. Deleting can be done based on the oldest
access time, smallest access count, or size of the query+output. Some balance
of the first two would probably work best, with the access time being the most
important. Everything will be configurable, of course.
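
One way that balance between access time and access count could be
expressed is a simple scoring function like the sketch below; the
particular weights are invented and would really be the configurable
parameters just mentioned.

    #include <stddef.h>
    #include <time.h>

    /* Just the bookkeeping fields needed to pick an eviction victim. */
    typedef struct CacheEntryStats
    {
        time_t  access_time;
        long    access_count;
        size_t  total_size;
    } CacheEntryStats;

    static double
    eviction_score(const CacheEntryStats *e, time_t now)
    {
        double idle  = difftime(now, e->access_time);   /* seconds unused */
        double reuse = (double) e->access_count;

        /* Higher score == better candidate for deletion.  Idle time
         * dominates, frequent reuse protects an entry, and size acts
         * as a mild tie-breaker (bigger entries free more room). */
        return idle * 10.0 - reuse * 2.0 + (double) e->total_size / 1024.0;
    }

    /* Return the index of the entry to evict from an array of entries. */
    static int
    choose_victim(const CacheEntryStats *entries, int n, time_t now)
    {
        int     victim = 0;
        double  best = eviction_score(&entries[0], now);

        for (int i = 1; i < n; i++)
        {
            double s = eviction_score(&entries[i], now);
            if (s > best)
            {
                best = s;
                victim = i;
            }
        }
        return victim;
    }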

Whenever a table is changed, the cache must be checked as well. A list of
all columns that were actually changed is computed and compared against
the list of columns for each query. At the first sign of a match, the
query is marked as "invalid." This should happen before the changes are made
to the table itself. We do not delete the query immediately since this may
be inside of a transaction, and subject to rollback. However, we do need
to mark it as invalid for the current user inside the current transaction:
thus, the status flag. When the transaction is committed, all queries that have
an "invalid" flag are deleted, then the tables are changed. Since the only
time a query can be flagged as "invalid" is inside your own transaction,
the deletion can be done very quickly.
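
In C, the invalidate-before-change and delete-at-commit logic might look
roughly like this. The structures are cut-down stand-ins for the entry
layout sketched earlier, and my_backend_id is just whatever identifies the
current transaction.

    #include <stdbool.h>
    #include <string.h>

    /* Minimal stand-ins for the entry/column structures sketched above. */
    typedef struct ColumnRef { const char *relname; const char *attname; } ColumnRef;
    typedef struct ResultCacheEntry
    {
        bool        valid;
        int         invalidated_by;   /* backend id that flagged it, 0 = none */
        ColumnRef  *columns;
        int         ncolumns;
    } ResultCacheEntry;

    /* An entry is usable by a given backend unless it is globally invalid
     * or that backend's own open transaction has flagged it. */
    static bool
    entry_usable_by(const ResultCacheEntry *e, int backend_id)
    {
        return e->valid && e->invalidated_by != backend_id;
    }

    static bool
    columns_overlap(const ColumnRef *changed, int nchanged,
                    const ColumnRef *used, int nused)
    {
        for (int i = 0; i < nchanged; i++)
            for (int j = 0; j < nused; j++)
                if (strcmp(changed[i].relname, used[j].relname) == 0 &&
                    strcmp(changed[i].attname, used[j].attname) == 0)
                    return true;
        return false;
    }

    /* Called *before* the table itself is changed.  We only flag the
     * entry for our own transaction; the actual deletion waits until
     * commit, so a rollback leaves the entry valid for everyone. */
    static void
    invalidate_for_transaction(ResultCacheEntry *cache, int nentries,
                               const ColumnRef *changed, int nchanged,
                               int my_backend_id)
    {
        for (int i = 0; i < nentries; i++)
            if (cache[i].valid &&
                columns_overlap(changed, nchanged,
                                cache[i].columns, cache[i].ncolumns))
                cache[i].invalidated_by = my_backend_id;
    }

    /* At COMMIT: everything we flagged is removed (sketched here as just
     * clearing the valid bit). */
    static void
    on_commit(ResultCacheEntry *cache, int nentries, int my_backend_id)
    {
        for (int i = 0; i < nentries; i++)
            if (cache[i].invalidated_by == my_backend_id)
            {
                cache[i].valid = false;
                cache[i].invalidated_by = 0;
            }
    }

    /* At ROLLBACK: just clear our flag; the entry stays valid for others. */
    static void
    on_rollback(ResultCacheEntry *cache, int nentries, int my_backend_id)
    {
        for (int i = 0; i < nentries; i++)
            if (cache[i].invalidated_by == my_backend_id)
                cache[i].invalidated_by = 0;
    }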

II. Query plan caching

If a query is not cached, then it "falls through" to the next level of
caching, the query plan. This can either be automatic or strictly
user-requested (i.e. through the prepare-execute paradigm). The latter
is probably better, but it also would not hurt much to store non-explicitly
prepared queries in this cache as long as there is room. This cache has a
field for the query itself, the plan to be followed (i.e. scan this table,
that index, sort the results, then group them), the columns used, the access
time, the access count, and the total size. It may also want a simple flag
of "prepared or non-prepared", where prepared indicates an explicitly
prepared statement that has placeholders for future values. A good optimizer
will actually change the plan based on the values plugged into the prepared
queries, so that information should become a part of the query itself as
needed, and multiple queries may exist to handle different inputs. In
general, most of the inputs will be similar enough to use the same path (e.g.
"SELECT flavor FROM foo WHERE size=?" will usually result in a simple
numeric value for the executes). If a match *is* found, then the database
can use the stored path, and not have to bother calling up the optimizer
to figure it out. It then updates the access time, the access count, and
continues as normal. If a match is *not* found, then the query may be a
candidate for caching. Certainly, explicit prepares should always be cached.
Non-explicitly prepared queries (those without placeholders) can also be
cached. In theory, some of this will also be in the result cache, so that
should be checked as well: if it is there, there is no reason to put it here.
Prepared queries should always have priority over non-prepared, and the rest
of the rules above for the result cache should also apply, with the caveat
that things
that would affect the output of the optimizer (e.g. vacuuming) should also
be taken into account when deleting entries.
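
A plan-cache entry might be laid out along these lines; again the names are
invented, and "Plan" just stands in for whatever structure the optimizer
produces.

    #include <stdbool.h>
    #include <stddef.h>
    #include <time.h>

    typedef struct Plan Plan;   /* opaque stand-in for the optimizer's output */

    /* One entry in the query plan cache, mirroring the fields above. */
    typedef struct PlanCacheEntry
    {
        char    *query;        /* query text, possibly containing ? placeholders */
        Plan    *plan;         /* the stored path: scans, sorts, groupings, ...  */
        char   **columns;      /* columns/relations the plan depends on          */
        int      ncolumns;
        time_t   access_time;
        long     access_count;
        size_t   total_size;
        bool     is_prepared;  /* explicitly prepared (placeholders) versus an
                                * opportunistically cached plain statement       */
    } PlanCacheEntry;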

III. Relation caching

The final cache is the relation itself, and simply involves putting the entire
relation into memory. This cache has a field for the name of the relation,
the table info itself, the type (indexes should ideally be cached more than
tables, for example), the access time, and the access count. Loading could
be done automatically, but most likely should be done according to a flag
on the table itself or as an explicit command by the user.
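
And the relation cache, sketched the same way; the "kind" field is only
there to reflect the idea that indexes deserve higher caching priority than
tables, and every name is invented.

    #include <stdbool.h>
    #include <stddef.h>
    #include <time.h>

    typedef enum RelCacheKind
    {
        RELCACHE_TABLE,
        RELCACHE_INDEX          /* indexes would be favored over plain tables */
    } RelCacheKind;

    /* One entry in the relation cache. */
    typedef struct RelationCacheEntry
    {
        char         *relname;       /* the cached relation                   */
        void         *data;          /* the relation's pages, held in RAM     */
        size_t        nbytes;
        RelCacheKind  kind;          /* table or index                        */
        bool          load_on_flag;  /* loaded because of a per-table flag or
                                      * an explicit user command, rather than
                                      * automatically                         */
        time_t        access_time;
        long          access_count;
    } RelationCacheEntry;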

Notes:

The "strcmp" used may seem rather crude, as it misses all but the exact
same query, but it does cover most of the everyday cases. Queries are
usually called through an application that keeps them in the same format,
time after time, so the queries are very often exactly identical. A better
parser would help, of course, but it would get rather complicated quickly.
Two quick examples: a web page that is read from a database is a query that
is called many times with exactly the same syntax; a user doing a "refresh"
to constantly check if something has changed since they last looked.

Sometimes a query may jump back to a previous type of cache, especially
for things like subselects. The query containing the subselect may not match
as a whole, but the inner query should also be checked against the query
result cache.

Each cache should have some configurable parameters, including the size in
RAM, the maximum number of entries, and rules for adding and deleting.
They should also be directly viewable through a system table, so a DBA
can quickly see exactly which queries are being cached and how often
they are being used. There should be a command to quickly flush the cache,
to remove "old" entries, and to populate the query plan cache via a prepare
statement. It should also be possible to make table changes without stopping
to check the cache: perhaps flushing the cache and setting a global
"cache is empty" flag would suffice.

Another problem is the prepare and execute: you are not always guaranteed
to get a cached prepare if you do an execute, as it may have expired or
there may simply be no more room. Prepared statements made inside a
transaction should probably be flagged as "non-deletable" until the
transaction ends.
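
One way to get that guarantee is to pin the entry for the life of the
transaction; a tiny sketch of that, with invented names:

    #include <stdbool.h>

    /* Cut-down stand-in: just the fields needed to show the pinning rule. */
    typedef struct PreparedEntry
    {
        bool  pinned;   /* "non-deletable": eviction must skip this entry       */
        int   owner;    /* backend/transaction that issued the prepare, 0 = none */
    } PreparedEntry;

    /* At prepare time inside a transaction: protect the entry so a later
     * execute in the same transaction cannot find it evicted. */
    static void
    pin_prepared(PreparedEntry *e, int my_backend_id)
    {
        e->pinned = true;
        e->owner  = my_backend_id;
    }

    /* At commit or rollback: the entry becomes an ordinary cache citizen
     * again, subject to the usual deletion rules. */
    static void
    unpin_prepared(PreparedEntry *e, int my_backend_id)
    {
        if (e->owner == my_backend_id)
        {
            e->pinned = false;
            e->owner  = 0;
        }
    }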

Storing the results of an execute in the query result cache is another
problem. When a prepare is made, the database returns a link to that
exact prepared statement in cache, so that all the client has to say is
"run the query at 0x4AB3 with the value "12". Ideally, the database
should be able to check these against the query result cache as well. It
can do this by reconstructing the resulting query (by plugging the value into
the prepared statement) or it can store the execute request as a type of
query itself; instead of "SELECT baz FROM bar WHERE size=12" it would
store "p0x4aB3:12".

The result cache would probably be the easiest to implement, and also gives
the most "bang for the buck." The path cache may be a bit harder, but is
a very necessary feature. I don't know about the relation caching: it looks
to be fairly easy, and I don't trust that, so I am guessing it is actually
quite difficult.

Greg Sabino Mullane greg(at)turnstep(dot)com
PGP Key: 0x14964AC8 200202281132

-----BEGIN PGP SIGNATURE-----
Comment: http://www.turnstep.com/pgp.html

iD8DBQE8fqybvJuQZxSWSsgRAps9AKDwCkIH7GKSBjflyYSA0F7mQqD1MwCeJLCw
hqE1SxJ2Z7RxFGCu3UwIBrI=
=jlBy
-----END PGP SIGNATURE-----
