Re: Database Caching

From: Karel Zak <zakkr(at)zf(dot)jcu(dot)cz>
To: Greg Sabino Mullane <greg(at)turnstep(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Database Caching
Date: 2002-03-01 10:02:05
Message-ID: 20020301110205.B17401@zf.jcu.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 28, 2002 at 10:23:46PM -0000, Greg Sabino Mullane wrote:

> The first, query result caching, simply means that we store into memory
> the exact output of a SELECT query for the next time that somebody performs
> that exact same SELECT query. Thus, if 800 people do a "SELECT * FROM foo",
> the database runs it for the first person, saves the results, and simply
> reads the cache for the next 799 requests. This saves the database from doing
> any disk access, practically removes CPU usage, and speeds up the query.

How expensive is keep cache in consistent state? May be maintain
the result cache is simular to read raw data from disk/buffer cache.

The result cache may be speeds up SELECT, but probably speeds down
UPDATE/INSERT.

> The second, query plan caching, involves saving the results of the optimizer,
> which is responsible for figuring out exactly "how" the databse is going to
> fetch the requested data. This type of caching usually involves a "prepared"
> query, which has almost all of the information needed to run the query with
> the exception of one or more "placeholders" (spots that are populated with
> variables at a later time). The query could also involve non-prepared
> statments as well. Thus, if someone prepares the query "SELECT flavor FROM
> foo WHERE size=?", and then executes it by sending in 300 different values
> for "size", the prepared statement is run through the optimizer, the r
> esulting path is stored into the query plan cache, and the stored path is
> used for the 300 execute requests. Because the path is already known, the
> optimizer does not need to be called, which saves the database CPU and time.

IMHO query plan cache maintained by user's PREPARE/EXECUTE/DEALLOCATE
statements is sufficient, because user good know what change in DB
scheme (drop function, relation...).

The "transparent" query cache for each query that go into backend
(IMHO) will too expensive, because you must check/analyze each query.
I mean more effective is keep in memory fragments of query, for example
operator, relation description -- the cache like this PostgreSQL already
have (syscache).

> The third, relation caching, simply involves putting the entire relation
> (usually a table or index) into memory so that it can be read quickly.

Already done by buffers :-)

Karel

--
Karel Zak <zakkr(at)zf(dot)jcu(dot)cz>
http://home.zf.jcu.cz/~zakkr/

C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nevermind 2002-03-01 10:32:01 Re: PostgreSQL Licence: GNU/GPL
Previous Message pgsql-bugs 2002-03-01 09:50:26 Bug #605: timestamp(timestamp('a timestamp)) no longer works