Proposing pg_hibernate

From: Gurjeet Singh <gurjeet(at)singh(dot)im>
To: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Proposing pg_hibernate
Date: 2014-02-04 00:18:54
Message-ID: CABwTF4Ui_anAG+ybseFunAH5Z6DE9aw2NPdy4HryK+M5OdXCCA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Please find attached the pg_hibernate extension. It is a
set-it-and-forget-it solution to enable hibernation of Postgres
shared-buffers. It can be thought of as an amalgam of pg_buffercache and
pg_prewarm.

It uses the background worker infrastructure. It registers one worker
process (BufferSaver) to save the shared-buffer metadata when server is
shutting down, and one worker per database (BlockReader) when restoring the
shared buffers.

It stores the buffer metadata under $PGDATA/pg_database/, one file per
database, and one separate file for global objects. It sorts the list of
buffers before storage, so that when it encounters a range of consecutive
blocks of a relation's fork, it stores that range as just one entry, hence
reducing the storage and I/O overhead.

On-disk binary format, which is used to create the per-database save-files,
is defined as:
1. One or more relation filenodes; stored as r<relfilenode>.
2. Each realtion is followed by one or more fork number; stored as
f<forknumber>
3. Each fork number is followed by one or more block numbers; stored as
b<blocknumber>
4. Each block number is followed by zero or more range numbers; stored as
N<number>

{r {f {b N* }+ }+ }+

Possible enhancements:
- Ability to save/restore only specific databases.
- Control how many BlockReaders are active at a time; to avoid I/O storms.
- Be smart about lowered shared_buffers across the restart.
- Different modes of reading like pg_prewarm does.
- Include PgFincore functionality, at least for Linux platforms.

The extension currently works with PG 9.3, and may work on 9.4 without any
changes; I haven't tested, though. If not, I think it'd be easy to port to
HEAD/PG 9.4. I see that 9.4 has put a cap on maximum background workers via
a GUC, and since my aim is to provide a non-intrusive no-tuning-required
extension, I'd like to use the new dynamic-background-worker infrastructure
in 9.4, which doesn't seem to have any preset limits (I think it's limited
by max_connections, but I may be wrong). I can work on 9.4 port, if there's
interest in including this as a contrib/ module.

To see the extension in action:

.) Compile it.
.) Install it.
.) Add it to shared_preload_libraries.
.) Start/restart Postgres.
.) Install pg_buffercache extension, to inspect the shared buffers.
.) Note the result of pg_buffercache view.
.) Work on your database to fill up the shared buffers.
.) Note the result of pg_buffercache view, again; there should be more
blocks than last time we checked.
.) Stop and start the Postgres server.
.) Note the output of pg_buffercache view; it should contain the blocks
seen just before the shutdown.
.) Future server restarts will automatically save and restore the blocks in
shared-buffers.

The code is also available as Git repository at
https://github.com/gurjeet/pg_hibernate/

Demo:

$ make -C contrib/pg_hibernate/
$ make -C contrib/pg_hibernate/ install
$ vi $B/db/data/postgresql.conf
$ grep shared_preload_libraries $PGDATA/postgresql.conf
shared_preload_libraries = 'pg_hibernate' # (change requires restart)

$ pgstart
waiting for server to start.... done
server started

$ pgsql -c 'create extension pg_buffercache;'
CREATE EXTENSION

$ pgsql -c 'select count(*) from pg_buffercache where relblocknumber is not
null group by reldatabase;'
count
-------
163
14
(2 rows)

$ pgsql -c 'create table test_hibernate as select s as a, s::char(1000) as
b from generate_series(1, 100000) as s;'
SELECT 100000

$ pgsql -c 'create index on test_hibernate(a);'
CREATE INDEX

$ pgsql -c 'select count(*) from pg_buffercache where relblocknumber is not
null group by reldatabase;'
count
-------
2254
14
(2 rows)

$ pgstop
waiting for server to shut down....... done
server stopped

$ pgstart
waiting for server to start.... done
server started

$ pgsql -c 'select count(*) from pg_buffercache where relblocknumber is not
null group by reldatabase;'
count
-------
2264
17
(2 rows)

There are a few more blocks than the time they were saved, but all the
blocks from before the restart are present in shared buffers after the
restart.

Best regards,
--
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com <http://www.enterprisedb.com>

Attachment Content-Type Size
pg_hibernate.tgz application/x-gzip 8.6 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2014-02-04 00:40:33 Re: Failure while inserting parent tuple to B-tree is not fun
Previous Message Stephen Frost 2014-02-04 00:13:46 Re: [COMMITTERS] pgsql: Include planning time in EXPLAIN ANALYZE output.