Re: Help tuning a large table off disk and into RAM

From: "Jimmy Choi" <JCHOI(at)altera(dot)com>
To: "Bill Moran" <wmoran(at)potentialtech(dot)com>, "James Williams" <james(dot)wlms(at)googlemail(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Help tuning a large table off disk and into RAM
Date: 2007-09-26 15:31:41
Message-ID: 6E3775AF29598B46AA3F102067A510F1020F8434@tor-ismsg01.altera.priv.altera.com
Lists: pgsql-general

Have you tried clustering the table on its most frequently used
index to improve locality?

http://www.postgresql.org/docs/8.2/static/sql-cluster.html
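
A minimal sketch, assuming 8.2 syntax and made-up names ("hits" for
the big table, an index on (country, epoch) to match the typical
query below):

    -- Index matching the usual WHERE clause (hypothetical names):
    CREATE INDEX hits_country_epoch_idx ON hits (country, epoch);

    -- 8.2 syntax: physically rewrite the table in index order.
    -- Takes an exclusive lock, and the ordering decays as new rows
    -- arrive, so re-run periodically and ANALYZE afterwards:
    CLUSTER hits_country_epoch_idx ON hits;
    ANALYZE hits;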

-----Original Message-----
From: pgsql-general-owner(at)postgresql(dot)org
[mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Bill Moran
Sent: Wednesday, September 26, 2007 11:24 AM
To: James Williams
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] Help tuning a large table off disk and into RAM

In response to "James Williams" <james(dot)wlms(at)googlemail(dot)com>:

> I'm stuck trying to tune a big-ish postgres db and wondering if anyone
> has any pointers.
>
> I cannot get Postgres to make good use of plenty of available RAM and
> stop thrashing the disks.
>
> One main table: ~30 million rows, 20 columns, all integer, smallint,
> or char(2). Most have an index. It's a table for holding webserver
> logs; the main table is all foreign key ids. Row size is ~100 bytes.
>
> The typical query is an aggregate over a large number of rows
> (say ~25%).
>
> SELECT COUNT(*), COUNT(DISTINCT user_id)
> FROM table
> WHERE epoch > ...
> AND epoch < ...
> AND country = ...
>
> The box has 4 x Opterons, 4 GB RAM & five 15k rpm disks in RAID 5. We
> wanted fast query/lookup. We know we can get fast disk IO.
>
> Running a typical query like above seems to:
>
> * hardly tax a single CPU
> * leave plenty of RAM free
> * thrash the disks
>
> The last is based mostly on the observation that another tiny,
> unrelated MySQL db, which normally runs fast, grinds to a halt when
> we're querying the Postgres db (while CPU and memory appear to have
> spare capacity).
>
> We've currently got these settings, and have tried doubling/halving
> them, restarting, and benchmarking a test query. The changes don't
> appear to materially alter our query time.
>
> shared_buffers = 128MB

shared_buffers = 1.5GB

Unless you've got a lot of stuff other than PostgreSQL on this machine.
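
One caveat, assuming Linux (this sketch is mine, not from the thread):
a shared-memory segment that large usually exceeds the kernel's
default limit, so raise SHMMAX first or the postmaster will refuse to
start:

    # /etc/sysctl.conf -- example values sized for a 2 GB segment
    kernel.shmmax = 2147483648    # max single segment, in bytes
    kernel.shmall = 524288        # total shared memory, in 4 kB pages
    # apply with: sysctl -p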

> temp_buffers = 160MB
> work_mem = 200MB
> max_stack_depth = 7MB

These look reasonable, although I can't be sure without more details.

>
> We're less concerned about insert speed. Typically 1 or 2 users, but
> want fast queries.
>
> Perhaps a little extreme, but I'm trying to express this in a way
> that Postgres understands:
>
> * Load this table, and one or two indexes (epoch, user_id) into RAM.

Give it enough shared_buffers and it will do that. You're estimating
the size of your table at ~3 GB (try pg_relation_size() on it to get
an actual size). If you really want to keep _all_ of it in memory all
the time, you're probably going to need to add RAM to the machine.
With 8 GB, you could allocate about 3 GB to shared_buffers, but that
would be ignoring the size of the indexes.
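
For example (8.1 and later; "hits" stands in for the real table name):

    -- Heap only:
    SELECT pg_size_pretty(pg_relation_size('hits'));

    -- Heap plus indexes and TOAST, i.e. what actually has to fit:
    SELECT pg_size_pretty(pg_total_relation_size('hits'));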

However, I think you'll be surprised at how much performance improves
with 1.5 GB of shared_buffers; you may not need any more. 128 MB
really forces PG to work within a very limited space.

> * All of the table, all of those indexes.
> * Keep them there, but keep a disk based backup for integrity.
> * Run all selects against the in RAM copy. Always.

This is what PG does if you allocate enough shared_buffers.
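
If you want to warm the cache right after a restart, a crude trick
(my sketch, "hits" again hypothetical):

    -- A full scan pulls every heap page through shared_buffers (8.2
    -- has no seq-scan ring buffer yet). Pages can still be evicted,
    -- so treat this as a nudge, not a pin:
    SELECT count(*) FROM hits;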

--
Bill Moran
http://www.potentialtech.com


