Re: PG writes a lot to the disk

From: Bill Moran <wmoran(at)collaborativefusion(dot)com>
To: "Laurent Raufaste" <analogue(at)glop(dot)org>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: PG writes a lot to the disk
Date: 2008-03-19 13:49:22
Message-ID: 20080319094922.982e8abb.wmoran@collaborativefusion.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

In response to "Laurent Raufaste" <analogue(at)glop(dot)org>:

> I have a big PG server dedicated to serve only SELECT queries.
> The database is updated permanently using Slony.
>
> The server has 8 Xeon cores running at 3Ghz, 24GB of RAM and the
> following disk arrays:
> - one RAID1 serving the OS and the pg_xlog
> - one RAID5 serving the database and the tables (base directory)
> - one RAID5 serving the indexes (indexes have an alternate tablespace)
>
> This server can't take anything, it writes too much.
>
> When I try to plug it to a client (sending 20
> transactions/s) it works fine for like 10 minutes, then start to write
> a lot in the pgdata/base directory (where the database files are, not
> the index).
>
> It writes so much (3MB/s randomly) that it can't serve the queries anymore, the
> load is huge.
>
> In order to locate the problem, I stopped Slony (no updates anymore),
> mounted the database and index partitions with the sync option (no FS
> write cache), and the problem happens faster, like 2 minutes after
> having plugged the client (and the queries) to it.
> I can reproduce the problem at will.
>
> I tried to see if some file size were increasing a lot, and found
> nothing more than the usual DB increase (DB is constantly updated by
> Slony).
>
> What does it writes so much in the base directory ? If it's some
> temporary table or anything, how can I locate it so I can fix the
> problem ?

My guess (based on the information you provided) is that it's temporary
sort file usage. If you're using 8.3 there's a config option to log
each time a sort file is required. Anything earlier than 8.3 and you'll
have to rely on your OS tools to track it down.

However, what makes you so sure it's write activity? I see no evidence
attached to this email (iostat or similar output) so I'm wondering if
it's actually read activity.

Check your log levels, if you turn up PG's logging all the way, it generates
a LOT of write activity ... more than you might imagine under some loads.

Get rid of the RAID 5. RAID 5 sucks. Have you tried running bonnie++ or
similar to see if it's not just a really crappy RAID 5 controller?

> Here's the PG memory configuration:
> max_connections = 128
> shared_buffers = 2GB

Have you tuned this based on experience? Current best practices would
recommend that you start with ~6G (1/4 RAM) and tune up/down as experience
with your workload dictates.

> temp_buffers = 8MB
> work_mem = 96MB

Considering you've got 24G of RAM, you might want to try bumping this and
see if it helps without pushing the system into swap. If the problem
is sort file usage, this is the option to tune it.

> maintenance_work_mem = 4GB

I doubt it's hurting anything, but I don't think a value this high will
actually be used.

> max_stack_depth = 7MB
> default_statistics_target = 100
> effective_cache_size = 20GB

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran(at)collaborativefusion(dot)com
Phone: 412-422-3463x4023

****************************************************************
IMPORTANT: This message contains confidential information and is
intended only for the individual named. If the reader of this
message is not an intended recipient (or the individual
responsible for the delivery of this message to an intended
recipient), please be advised that any re-use, dissemination,
distribution or copying of this message is prohibited. Please
notify the sender immediately by e-mail if you have received
this e-mail by mistake and delete this e-mail from your system.
E-mail transmission cannot be guaranteed to be secure or
error-free as information could be intercepted, corrupted, lost,
destroyed, arrive late or incomplete, or contain viruses. The
sender therefore does not accept liability for any errors or
omissions in the contents of this message, which arise as a
result of e-mail transmission.
****************************************************************

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Bill Moran 2008-03-19 13:55:59 Re: question on TRUNCATE vs VACUUM FULL
Previous Message Mark Steben 2008-03-19 13:43:09 Re: question on TRUNCATE vs VACUUM FULL