Re: disk performance benchmarks

From: "Jeffrey W. Baker" <jwbaker(at)acm(dot)org>
To: Vivek Khera <khera(at)kcilink(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: disk performance benchmarks
Date: 2004-09-14 18:11:38
Message-ID: 1095185498.5792.12.camel@localhost.localdomain
Lists: pgsql-general pgsql-performance

On Tue, 2004-09-14 at 10:28, Vivek Khera wrote:
> >>>>> "SW" == Shane Wright <Shane> writes:
>
> SW> But, we have now taken the plunge and I'm in a position to do some
> SW> benchmarking to actually get some data. Basically I was wondering if
> SW> anyone else had any particular recommendations (or requests) about the
> SW> most useful kinds of benchmarks to do.
>
> I did a bunch of benchmarking on a 14 disk SCSI RAID array comparing
> RAID 5, 10, and 50. My tests consisted of doing a full restore of a
> 30Gb database (including indexes) and comparing the times to do the
> restore, the time to make the indexes, and the time to vacuum. Then I
> ran a bunch of queries.
>
> It was damn near impossible to pick a 'better' RAID config, so I just
> went with RAID5.
>
> You can find many of my posts on this topic on the list archives from
> about august - october of last year.
>
> Basically, you have to approach it holistically to tune the system: Pg
> config parameters, memory, and disk speed are the major factors.
>
> That and your schema needs to be not idiotic. :-)

I've recently been frustrated by this topic, because it seems like you
can design the hell out of a system, getting everything tuned with micro
and macro benchmarks, but when you put it in production the thing falls
apart.

Current issue:

A dual 64-bit Opteron 244 machine with 8GB main memory, two 4-disk RAID5
arrays (one for database, one for xlogs). PG's config is extremely
generous, and in isolated benchmarks it's very fast.

But in reality, performance is abysmal. There's something about what
PG does inside commits and checkpoints that sends Linux into a catatonic
state. For instance here's a snapshot of vmstat during a parallel heavy
select/insert load:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 3  0    216  13852  39656 7739724    0    0   820  2664 2868  2557 16  2 74  7
 0  0    216  17580  39656 7736460    0    0  3024  4700 3458  4313 42  6 52  0
 0  0    216  16428  39676 7737324    0    0   840  4248 3930  4516  0  4 89  8
 0  1    216  18620  39672 7736920    0    0  7576   516 2738  3347  1  4 55 39
 0  0    216  14972  39672 7738960    0    0  1992  2532 2509  2288  2  3 93  3
 0  0    216  13564  39672 7740592    0    0  1640  2656 2581  2066  1  3 97  0
 0  0    216  12028  39672 7742292    0    0  1688  3576 2072  1626  1  2 96  0
 0  0    216  18364  39680 7736164    0    0  1804  3372 1836  1379  1  4 96  0
 0  0    216  16828  39684 7737588    0    0  1432  2756 2256  1720  1  3 94  2
 0  0    216  15452  39684 7738812    0    0  1188  2184 2384  1830  1  2 97  0
 0  1    216  15388  39684 7740104    0    0  1336  2628 2490  1974  2  3 94  2
 6  0    216  15424  39684 7740240    0    0   104  3472 2757  1940  3  2 92  2
 0  0    216  14784  39700 7741856    0    0  1668  3320 2718  2332  0  3 97  0

You can see there's not much progress being made there. In the
presence of a fairly pathetic writeout, there's a tiny trickle of disk
reads, userspace isn't making any progress, the kernel isn't busy, and
few processes are in iowait. So what the heck is going on?
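
The vmstat samples above can be put into numbers by averaging a few
columns. This is only a minimal Python sketch: it assumes the 16-column
layout from the vmstat header row, and the helper names parse_vmstat and
column_means are hypothetical, not part of any tool.

```python
# Sketch: average selected columns of the vmstat samples quoted in this message.
# Assumption: each data row has the 16 columns named in the vmstat header.
VMSTAT_COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa".split()

SAMPLES = """\
3 0 216 13852 39656 7739724 0 0 820 2664 2868 2557 16 2 74 7
0 0 216 17580 39656 7736460 0 0 3024 4700 3458 4313 42 6 52 0
0 0 216 16428 39676 7737324 0 0 840 4248 3930 4516 0 4 89 8
0 1 216 18620 39672 7736920 0 0 7576 516 2738 3347 1 4 55 39
0 0 216 14972 39672 7738960 0 0 1992 2532 2509 2288 2 3 93 3
0 0 216 13564 39672 7740592 0 0 1640 2656 2581 2066 1 3 97 0
0 0 216 12028 39672 7742292 0 0 1688 3576 2072 1626 1 2 96 0
0 0 216 18364 39680 7736164 0 0 1804 3372 1836 1379 1 4 96 0
0 0 216 16828 39684 7737588 0 0 1432 2756 2256 1720 1 3 94 2
0 0 216 15452 39684 7738812 0 0 1188 2184 2384 1830 1 2 97 0
0 1 216 15388 39684 7740104 0 0 1336 2628 2490 1974 2 3 94 2
6 0 216 15424 39684 7740240 0 0 104 3472 2757 1940 3 2 92 2
0 0 216 14784 39700 7741856 0 0 1668 3320 2718 2332 0 3 97 0
"""


def parse_vmstat(text):
    """Return one dict per data row, skipping header or malformed lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Keep only rows with exactly 16 all-numeric fields.
        if len(fields) != len(VMSTAT_COLUMNS):
            continue
        if not all(f.isdigit() for f in fields):
            continue
        rows.append(dict(zip(VMSTAT_COLUMNS, (int(f) for f in fields))))
    return rows


def column_means(rows, names):
    """Arithmetic mean of each named column across all rows."""
    return {n: sum(r[n] for r in rows) / len(rows) for n in names}


rows = parse_vmstat(SAMPLES)
means = column_means(rows, ["bi", "bo", "us", "sy", "id", "wa"])
for name, value in means.items():
    print(f"{name}: {value:.1f}")
```

Averaging id and wa next to bi and bo makes it easy to see whether the
box is mostly idle or mostly waiting on I/O during the stall.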

This state of non-progress persists as long as the checkpoint subprocess
is active. I'm sure there's some magic way to improve this but I
haven't found it yet.

PS this is with Linux 2.6.7.

Regards,
jwb
