Re: patch for new feature: Buffer Cache Hibernation

From: Mitsuru IWASAKI <iwasaki(at)jp(dot)FreeBSD(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Cc: jeff(dot)janes(at)gmail(dot)com
Subject: Re: patch for new feature: Buffer Cache Hibernation
Date: 2011-05-05 09:06:45
Message-ID: 20110505.180645.48450792.iwasaki@jp.FreeBSD.org
Lists: pgsql-hackers

Hi,

> I think that PgFincore (http://pgfoundry.org/projects/pgfincore/)
> provides similar functionality. Are you familiar with that? If so,
> could you contrast your approach with that one?

Sorry, I was not familiar with PgFincore at all, but I got the source code
and documents and read through them just now.
# and I'm actually a novice with Postgres...
Both aim to reduce physical I/O, but their approaches and
gains are different.
My understanding is as follows:

+---------------------+     +---------------------+
| Postgres(backend)   |     | Postgres            |
| +-----------------+ |     |                     |
| | DB Buffer Cache | |     |                     |
| | (shared buffers)| |     |                     |
| |*my target       | |     |                     |
| +-----------------+ |     |                     |
|   ^      ^          |     |                     |
|   |      |          |     |                     |
|   v      v          |     |                     |
| +-----------------+ |     | +-----------------+ |
| | buffer manager  | |     | |    pgfincore    | |
| +-----------------+ |     | +-----------------+ |
+---^------^----------+     +----------^----------+
    |      |smgrread()                 |posix_fadvise()
    |read()|                           |                 userland
==================================================================
    |      |                           |                 kernel
    |      +-------------+-------------+
    |                    |
    |                    v
    |       +------------------------+
    |       | File System            |
    |       | +-----------------+    |
    +------>| | FS Buffer Cache |    |
            | |*PgFincore target|    |
            | +-----------------+    |
            |    ^       ^           |
            +----|-------|-----------+
                 |       |
==================================================================
                 |       |                               hardware
       +---------|-------|----------------+
       |         |       v  Physical Disk |
       |         |   +------------------+ |
       |         |   | base/16384/24598 | |
       |         v   +------------------+ |
       | +------------------------------+ |
       | |Buffer Cache Hibernation Files| |
       | +------------------------------+ |
       +----------------------------------+

In summary, PgFincore's target is the File System Buffer Cache, while
Buffer Cache Hibernation's target is the DB Buffer Cache (shared buffers).

PgFincore tries to preload database files with posix_fadvise() into the
File System Buffer Cache, not into the DB Buffer Cache (shared buffers).
On query execution, the buffer manager gets DB buffer blocks from the
file system with smgrread() unless the necessary blocks already exist in
the DB Buffer Cache. At this point, physical reads may not happen, because
part of (or all of) the database file is already loaded into the FS Buffer
Cache.

The gain depends on the file system, especially the size of the File System
Buffer Cache.
In short, preloading a database file is roughly equivalent to the following
command:
$ cat base/16384/24598 > /dev/null
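
As a rough illustration of the preloading idea (this is not PgFincore's
actual code), the same effect can be sketched in Python. The helper name
preload_file and its chunk_size parameter are hypothetical; the sketch
assumes a Unix-like system and falls back to plainly reading the file, like
the cat command above, where os.posix_fadvise is not available.

```python
import os


def preload_file(path, chunk_size=1 << 20):
    """Ask the kernel to pull `path` into the file system buffer cache.

    Hypothetical helper for illustration only. Returns the file size in bytes.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if hasattr(os, "posix_fadvise"):
            # Offset 0 with length 0 covers the file through its end.
            # This is only a hint; the kernel may ignore it.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
        else:
            # Fallback: read and discard the data, like `cat file > /dev/null`.
            while os.read(fd, chunk_size):
                pass
        return size
    finally:
        os.close(fd)
```

Either way, only the File System Buffer Cache is warmed; shared buffers are
left untouched.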

I think PgFincore is a good fit for data warehouse applications.

Buffer Cache Hibernation, my approach, is simpler and more straightforward.
It tries to save and load the contents of the DB Buffer Cache (shared
buffers) using regular files (called Buffer Cache Hibernation Files).
At startup, the buffer manager loads DB buffer blocks into the DB Buffer
Cache from the Buffer Cache Hibernation Files that were saved at the last
shutdown. Note that the database files are not read, so they are not
cached in the File System Buffer Cache at all. Only the contents of the DB
Buffer Cache are filled. Therefore, the penalty for a DB buffer cache miss
would be larger than with PgFincore.
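
To make the save/load idea concrete, here is a toy model in Python. It is
only a sketch under assumptions: the record layout (a relation file number
and a block number followed by the page bytes), the names save_buffers and
load_buffers, and the dict standing in for shared buffers are all
hypothetical and do not reflect the patch's real file format. The 8192-byte
block size is the PostgreSQL default.

```python
import struct

BLOCK_SIZE = 8192               # PostgreSQL's default block size
HEADER = struct.Struct("<II")   # hypothetical tag: (relation file number, block number)


def save_buffers(buffers, path):
    """At shutdown: write every cached page and its tag to a hibernation file."""
    with open(path, "wb") as f:
        for (rel, blk), page in buffers.items():
            if len(page) != BLOCK_SIZE:
                raise ValueError("page must be exactly BLOCK_SIZE bytes")
            f.write(HEADER.pack(rel, blk))
            f.write(page)


def load_buffers(path):
    """At startup: refill the cache from the hibernation file alone.

    The database files themselves are never opened here.
    """
    buffers = {}
    with open(path, "rb") as f:
        while True:
            head = f.read(HEADER.size)
            if len(head) < HEADER.size:
                break  # end of file (or a truncated trailing record)
            rel, blk = HEADER.unpack(head)
            buffers[(rel, blk)] = f.read(BLOCK_SIZE)
    return buffers
```

Because load_buffers reads only the hibernation file, nothing in this model
warms the File System Buffer Cache for the database files, which matches the
trade-off described above.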

The gain depends on the size of shared buffers and on how often
similar queries are executed before and after a restart.

Buffer Cache Hibernation is a good fit for OLTP applications.

I think that PgFincore and Buffer Cache Hibernation are not mutually
exclusive; they can work together at different caching levels.

Sorry for my poor English, but I'm doing my best :)

Thanks
