Re: PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Stefan Keller <sfkeller(at)gmail(dot)com>
Cc: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?
Date: 2012-02-28 22:41:34
Message-ID: CAGTBQpY16=6GLsQG=O2FFh4zfBKof6u9r=C94YA0Em6vPL76zg@mail.gmail.com
Lists: pgsql-performance

On Tue, Feb 28, 2012 at 5:48 PM, Stefan Keller <sfkeller(at)gmail(dot)com> wrote:
> P.S. And yes, the database is aka 'read-only' and truncated and
> re-populated from scratch every night. fsync is off so I don't care
> about ACID. After the indexes on name, hstore and geometry are
> generated I do a VACUUM FULL FREEZE. The current installation is a
> virtual machine with 4GB memory and the filesystem is "read/write".
> The future machine will be a pizza box with 72GB memory.

I don't get this. Something's wrong.

In the OP, you say "There is enough main memory to hold all table
contents." I'm assuming that refers to your current system, with 4GB
of memory.

So your data is less than 4GB, yet you'll be throwing a 72GB server
at it? Either that's tremendous overkill, or your data simply isn't
less than 4GB.
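
If in doubt, it's easy to measure. A quick sketch (pg_database_size
and pg_total_relation_size are standard built-ins, and both figures
include indexes and TOAST data):

-- Total on-disk size of the current database:
SELECT pg_size_pretty(pg_database_size(current_database()));

-- Largest tables, indexes and TOAST included:
SELECT relname,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind = 'r'
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 10;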

It's quite possible the VACUUM FULL is thrashing your disk cache due
to maintenance_work_mem. You can overcome this issue with the tar
trick, which is most easily performed as:

tar cf - $PGDATA/base > /dev/null

(Write the archive to stdout and redirect it: if you give tar
/dev/null as the archive file directly, GNU tar notices and skips
reading the file contents altogether.)

tar will read all the tables' contents and populate the OS cache.
From there, getting into shared_buffers should be very, very quick.
If it is true that your data fits in 4GB, that should fix it all.
Beware: whatever you allocate to shared_buffers will be held in RAM
twice, once in shared_buffers and once in the OS cache, so your data
really has to fit in 4GB minus shared_buffers.
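
If you want to check what actually sits in shared_buffers, the
pg_buffercache contrib module can tell you (assuming it's installed;
it ships with PostgreSQL):

CREATE EXTENSION pg_buffercache;  -- once, as superuser

-- How much of shared_buffers is occupied, assuming the default
-- 8kB block size:
SELECT pg_size_pretty(count(*) * 8192) AS buffers_in_use
FROM pg_buffercache
WHERE relfilenode IS NOT NULL;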

I don't think query-based tricks will load everything into RAM,
because you will get sequential scans rather than index scans - the
indices will remain uncached. If you forced an index scan, it would
have to read the whole index in random order (random I/O), which
would be horribly slow. The best way is the tar trick above: read
the whole data directory and be done with it.
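
For illustration, the query-based approach would look something like
this (the table and its index on name are hypothetical stand-ins for
yours; enable_seqscan is a real planner setting):

-- Warms the heap only; the planner will pick a sequential scan:
SELECT count(*) FROM osm_point;

-- Pushing the planner toward the index warms it too, but in
-- random order:
SET enable_seqscan = off;
SELECT count(*) FROM osm_point WHERE name >= '';
RESET enable_seqscan;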

Another option is to issue a plain VACUUM after the VACUUM FULL. A
plain vacuum will just scan the tables and indices - hopefully doing
nothing, since the VACUUM FULL will have cleaned everything already,
but loading everything into both the OS cache and shared_buffers
along the way.
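
So the nightly sequence would be something along these lines (a
sketch, run after the bulk load and index builds):

VACUUM FULL FREEZE;  -- rewrite tables, freeze all tuples
VACUUM;              -- re-scan tables and indices, warming the caches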
