Re: Database Caching

From: "Marc G(dot) Fournier" <scrappy(at)hub(dot)org>
To: Jan Wieck <janwieck(at)yahoo(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Database Caching
Date: 2002-03-01 15:07:28
Message-ID: 20020301110357.H49236-100000@mail1.hub.org
Lists: pgsql-hackers

On Fri, 1 Mar 2002, Jan Wieck wrote:

> Tom Lane wrote:
> > "Greg Sabino Mullane" <greg(at)turnstep(dot)com> writes:
> > > III. Relation caching
> >
> > > The final cache is the relation itself, and simply involves putting the entire
> > > relation into memory. This cache has a field for the name of the relation,
> > > the table info itself, the type (indexes should ideally be cached more than
> > > tables, for example), the access time, and the access number. Loading could
> > > be done automatically, but most likely should be done according to a flag
> > > on the table itself or as an explicit command by the user.
> >
> > This would be a complete waste of time; the buffer cache (both Postgres'
> > own, and the kernel's disk cache) serves the purpose already.
> >
> > As I've commented before, I have deep misgivings about the idea of a
> > query-result cache, too.
>
> I wonder how this sort of query result caching could work in
> our MVCC and visibility world at all. Multiple concurrent
> running transactions see different snapshots of the table,
> hence different result sets for exactly one and the same
> querystring at the same time ... er ... yeah, one cache set
> per query/snapshot combo, great!
>
> To really gain some speed with this sort of query cache, we'd
> have to adopt the #1 MySQL design rule "speed over precision"
> and ignore MVCC for query-cached relations, or what?

Actually, I think you are missing, as is everyone, the 'semi-static'
database ... you know? The one where a script dumps data into it every
5 minutes, but between dumps there are hundreds of queries per
second/minute that are the same query repeated each time ...

As soon as there is *any* change to the data set, the query cache should
be marked dirty and reloaded ... mark it dirty on any update, delete or
insert ...

So, if I have 1000 *pure* SELECTs, the cache is fine ... as soon as one
U/I/D pops up, it's invalidated ...
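
A minimal sketch of the invalidate-on-any-write idea described above, in Python rather than anything resembling backend code. The names (QueryCache, select, write, run_query) are hypothetical and chosen only for illustration. It assumes a single process and does not address the MVCC snapshot concern raised in the quoted text: every reader is assumed to want the latest committed data.

```python
from typing import Any, Callable, Dict, List


class QueryCache:
    """Caches SELECT results by query string; any write drops every entry."""

    def __init__(self, run_query: Callable[[str], List[Any]]) -> None:
        # run_query is a stand-in for actually executing the query.
        self._run_query = run_query
        self._results: Dict[str, List[Any]] = {}
        self.hits = 0
        self.misses = 0

    def select(self, sql: str) -> List[Any]:
        # Serve a repeated query from the cache when an entry exists.
        if sql in self._results:
            self.hits += 1
            return self._results[sql]
        # Otherwise execute it and remember the result.
        self.misses += 1
        rows = self._run_query(sql)
        self._results[sql] = rows
        return rows

    def write(self, apply_change: Callable[[], None]) -> None:
        # Any UPDATE/INSERT/DELETE marks the whole cache dirty.
        apply_change()
        self._results.clear()


# Toy "table" standing in for the semi-static data set.
table = [1, 2, 3]
cache = QueryCache(lambda sql: list(table))

for _ in range(1000):
    cache.select("SELECT * FROM t")      # executed once, then served from cache

cache.write(lambda: table.append(4))     # the periodic dump invalidates the cache
rows = cache.select("SELECT * FROM t")   # executed again against the new data
```

Clearing the whole cache on every write is the coarsest possible policy; it matches the "mark it dirty on any update, delete or insert" rule above, at the cost of also discarding results for tables the write never touched.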
