Re: High load and iowait but no disk access

From: Rémy Beaumont <remyb(at)medrium(dot)com>
To: "Anjan Dave" <adave(at)vantage(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: High load and iowait but no disk access
Date: 2005-08-30 18:50:35
Message-ID: 6bb032ec58fb35eb6e5a3cefd3dd9360@medrium.com
Lists: pgsql-performance


On 30-Aug-05, at 14:46, Anjan Dave wrote:

> I have seen references to changing the kernel I/O scheduler at boot
> time… not sure if it applies to RHEL 3.0, or will help, but try
> setting ‘elevator=deadline’ on the kernel boot line or via grub.conf.
That's only for RHEL 4.0.
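(For reference, a minimal sketch of how that flag is set on RHEL 4 via
grub.conf; the kernel version and paths below are examples only:)

# /boot/grub/grub.conf -- illustrative entry, paths are examples
title Red Hat Enterprise Linux AS (2.6.9-5.ELsmp)
        root (hd0,0)
        kernel /vmlinuz-2.6.9-5.ELsmp ro root=/dev/sda2 elevator=deadline
        initrd /initrd-2.6.9-5.ELsmp.img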

> Have you tried running a simple ‘dd’ on the LUN?
We get amazing performance using dd.
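(For context, a typical sequential-read dd test against the LUN, as a
sketch; the device path and sizes are examples only:)

# read roughly 8 GB sequentially off the device, 8 kB at a time
dd if=/dev/sdb of=/dev/null bs=8k count=1000000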
> The drives are in RAID10 configuration, right?
NetApp uses its own RAID layout (RAID4, underneath their WAFL filesystem).

Rémy
>  
> Thanks,
> Anjan
>
> From: Woody Woodring [mailto:george(dot)woodring(at)iglass(dot)net]
> Sent: Tuesday, August 30, 2005 2:30 PM
> To: 'Rémy Beaumont'; pgsql-performance(at)postgresql(dot)org
> Subject: Re: [PERFORM] High load and iowait but no disk access
>  
> Have you tried a different kernel?  We run with a NetApp over NFS
> without any issues, but we have seen high IO-wait on other Dell boxes
> (running and not running postgres) with RHES 3.  We replaced a Dell
> PowerEdge 350 running RH 7.3 with a PE750 with more memory running
> RHES 3, and it would get bogged down with IO waits just from syslog
> messages writing to disk, while the old, slower server could handle it
> fine.  I don't know if it is a Dell thing or the RH kernel, but we try
> different kernels on our boxes to find one that works better.  We have
> not found one that consistently stands out over another, but we have
> been moving away from the Update 2 kernel (2.4.21-15.ELsmp) due to
> server lockup issues.  Unfortunately we get the best disk throughput
> on our few remaining 7.3 boxes.
>  
> Woody
>  
> IGLASS Networks
> www.iglass.net
>  
>
>
> From: pgsql-performance-owner(at)postgresql(dot)org
> [mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Rémy
> Beaumont
> Sent: Monday, August 29, 2005 9:43 AM
> To: pgsql-performance(at)postgresql(dot)org
> Subject: [PERFORM] High load and iowait but no disk access
> We have been trying to pinpoint what originally seemed to be an I/O
> bottleneck but now appears to be an issue with either PostgreSQL or
> RHES 3.
>
> We have the following test environment on which we can reproduce the
> problem:
>
> 1) Test System A
> Dell 6650 Quad Xeon Pentium 4
> 8 Gig of RAM
> OS: RHES 3 update 2
> Storage: NetApp FAS270 connected via an FC card, using 10 disks
>
> 2) Test System B
> Dell Dual Xeon Pentium III
> 2 Gig of RAM
> OS: RHES 3 update 2
> Storage: NetApp FAS920 connected via an FC card, using 28 disks
>
> Our database size is around 30 GB.
>
> The behavior we see is that when running queries that do random reads
> on disk, iowait goes over 80% and actual disk IO falls to a crawl,
> with throughput below 3000 kB/s (we usually average 40000 kB/s to
> 80000 kB/s on sequential read operations on the NetApps).
>
> The stats on the NetApp confirm that it is sitting idle. Doing an
> strace on the PostgreSQL process shows that it is doing seeks and
> reads.
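> (A sketch of how such a trace can be captured; 30960 is the postmaster
> PID taken from the top output below:)
>
>   strace -p 30960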
>
> So my question is: where is this iowait time spent?
> Is there a way to pinpoint the problem in more detail?
> We are able to reproduce this behavior with PostgreSQL 7.4.8 and 8.0.3.
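> (One way to dig further, assuming the sysstat package is installed on
> the host: extended per-device stats show whether the device or the
> host is the bottleneck:)
>
>   iostat -x 1    # watch await, svctm and %util per device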
>
> I have included the output of top, vmstat, and strace, plus sysstat
> from the NetApp, for System B while running a single query that
> generates this behavior.
>
> Rémy
>
> top output:
> 06:27:28 up 5 days, 16:59, 6 users, load average: 1.04, 1.30, 1.01
> 72 processes: 71 sleeping, 1 running, 0 zombie, 0 stopped
> CPU states: cpu user nice system irq softirq iowait idle
> total 2.7% 0.0% 1.0% 0.1% 0.2% 46.0% 49.5%
> cpu00 0.2% 0.0% 0.2% 0.0% 0.2% 2.2% 97.2%
> cpu01 5.3% 0.0% 1.9% 0.3% 0.3% 89.8% 1.9%
> Mem: 2061696k av, 2043936k used, 17760k free, 0k shrd, 3916k buff
> 1566332k actv, 296648k in_d, 30504k in_c
> Swap: 16771584k av, 21552k used, 16750032k free 1933772k cached
>
> PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
> 30960 postgres 15 0 13424 10M 9908 D 2.7 0.5 2:00 1 postmaster
> 30538 root 15 0 1080 764 524 S 0.7 0.0 0:43 0 sshd
> 1 root 15 0 496 456 436 S 0.0 0.0 0:08 0 init
> 2 root RT 0 0 0 0 SW 0.0 0.0 0:00 0 migration/0
> 3 root RT 0 0 0 0 SW 0.0 0.0 0:00 1 migration/1
> 4 root 15 0 0 0 0 SW 0.0 0.0 0:01 0 keventd
> 5 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
> 6 root 34 19 0 0 0 SWN 0.0 0.0 0:00 1 ksoftirqd/1
> 9 root 15 0 0 0 0 SW 0.0 0.0 0:24 1 bdflush
> 7 root 15 0 0 0 0 SW 0.0 0.0 6:53 1 kswapd
> 8 root 15 0 0 0 0 SW 0.0 0.0 8:44 1 kscand
> 10 root 15 0 0 0 0 SW 0.0 0.0 0:13 0 kupdated
> 11 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 mdrecoveryd
> 17 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 ahc_dv_0
>
>
> vmstat output
> procs memory swap io system cpu
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 0 1 21552 17796 4872 1931928 2 3 3 1 27 6 2 1 7 3
> 0 1 21552 18044 4880 1931652 0 0 1652 0 397 512 1 2 50 47
> 0 1 21552 17976 4896 1931664 0 0 2468 0 407 552 2 2 50 47
> 1 0 21552 17984 4896 1931608 0 0 2124 0 418 538 3 3 48 46
> 0 1 21552 18028 4900 1931536 0 0 1592 0 385 509 1 3 50 46
> 0 1 21552 18040 4916 1931488 0 0 1620 820 419 581 2 2 50 46
> 0 1 21552 17968 4916 1931536 0 4 1708 4 402 554 3 1 50 46
> 1 1 21552 18052 4916 1931388 0 0 1772 0 409 531 3 1 49 47
> 0 1 21552 17912 4924 1931492 0 0 1772 0 408 565 3 1 48 48
> 0 1 21552 17932 4932 1931440 0 4 1356 4 391 545 5 0 49 46
> 0 1 21552 18320 4944 1931016 0 4 1500 840 414 571 1 1 48 50
> 0 1 21552 17872 4944 1931440 0 0 2116 0 392 496 1 5 46 48
> 0 1 21552 18060 4944 1931232 0 0 2232 0 423 597 1 2 48 49
> 1 1 21552 17684 4944 1931584 0 0 1752 0 395 537 1 1 50 48
> 0 1 21552 18000 4944 1931240 0 0 1576 0 401 549 0 1 50 49
>
>
> NetApp stats:
> CPU NFS CIFS HTTP Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP
> CP Disk DAFS FCP iSCSI FCP kB/s
> in out read write read write age hit time ty util in out
> 2% 0 0 0 139 0 0 2788 0 0 0 3 96% 0% - 15% 0 139 0 3 2277
> 2% 0 0 0 144 0 0 2504 0 0 0 3 96% 0% - 18% 0 144 0 3 2150
> 2% 0 0 0 130 0 0 2212 0 0 0 3 96% 0% - 13% 0 130 0 3 1879
> 3% 0 0 0 169 0 0 2937 80 0 0 3 96% 0% - 13% 0 169 0 4 2718
> 2% 0 0 0 139 0 0 2448 0 0 0 3 96% 0% - 12% 0 139 0 3 2096
> 2% 0 0 0 137 0 0 2116 0 0 0 3 96% 0% - 10% 0 137 0 3 1892
> 3% 0 0 0 107 0 0 2660 812 0 0 3 96% 24% T 20% 0 107 0 3 1739
> 2% 0 0 0 118 0 0 1788 0 0 0 3 96% 0% - 13% 0 118 0 3 1608
> 2% 0 0 0 136 0 0 2228 0 0 0 3 96% 0% - 11% 0 136 0 3 2018
> 2% 0 0 0 119 0 0 1940 0 0 0 3 96% 0% - 13% 0 119 0 3 1998
> 2% 0 0 0 136 0 0 2175 0 0 0 3 96% 0% - 14% 0 136 0 3 1929
> 2% 0 0 0 133 0 0 1924 0 0 0 3 96% 0% - 19% 0 133 0 3 2292
> 2% 0 0 0 115 0 0 2044 0 0 0 3 96% 0% - 11% 0 115 0 3 1682
> 2% 0 0 0 134 0 0 2256 0 0 0 3 96% 0% - 12% 0 134 0 3 2096
> 2% 0 0 0 112 0 0 2184 0 0 0 3 96% 0% - 12% 0 112 0 3 1633
> 2% 0 0 0 163 0 0 2348 0 0 0 3 96% 0% - 13% 0 163 0 4 2421
> 2% 0 0 0 120 0 0 2056 184 0 0 3 96% 8% T 14% 0 120 0 3 1703
>
> strace output:
> read(55, "\4\0\0\0\10fm}\1\0\0\0p\0\264\0\0 \2 \230\236\320\0020"...,
> 8192) = 8192
> _llseek(55, 857997312, [857997312], SEEK_SET) = 0
> read(55, "\4\0\0\0\\\315\321|\1\0\0\0p\0\354\0\0 \2 \250\236\260"...,
> 8192) = 8192
> _llseek(55, 883220480, [883220480], SEEK_SET) = 0
> read(55, "\4\0\0\0T\17a~\1\0\0\0p\0\20\1\0 \2
> \270\236\220\2D\235"..., 8192) = 8192
> _llseek(55, 858005504, [858005504], SEEK_SET) = 0
> read(55, "\4\0\0\0\300\356\321|\1\0\0\0p\0\330\0\0 \2
> \260\236\240"..., 8192) = 8192
> _llseek(55, 857964544, [857964544], SEEK_SET) = 0
> read(55, "\4\0\0\0lH\321|\1\0\0\0p\0<\1\0 \2 \300\236\200\2p\235"...,
> 8192) = 8192
> _llseek(55, 857956352, [857956352], SEEK_SET) = 0
> read(55, "\4\0\0\0l\'\321|\1\0\0\0p\0\320\0\0 \2
> \260\236\240\2\\"..., 8192) = 8192
> _llseek(55, 910802944, [910802944], SEEK_SET) = 0
> read(55, "\4\0\0\0\10}\25\200\1\0\0\0l\0\274\1\0 \2 \250\236\260"...,
> 8192) = 8192
> _llseek(55, 857948160, [857948160], SEEK_SET) = 0
> read(55, "\4\0\0\0\370\5\321|\1\0\0\0p\0\350\0\0 \2 \230\236\320"...,
> 8192) = 8192
> _llseek(56, 80371712, [80371712], SEEK_SET) = 0
> read(56, "\4\0\0\0Lf \217\1\0\0\0p\0\f\1\0 \2
> \250\236\260\2T\235"..., 8192) = 8192
> read(102,
> "\2\0\34\0001\236\0\0\1\0\0\0\t\0\0\00020670\0\0\0B\6\0"..., 8192) =
> 8192
> _llseek(55, 857939968, [857939968], SEEK_SET) = 0
> read(55, "\4\0\0\0\244\344\320|\1\0\0\0l\0\230\1\0 \2
> \244\236\270"..., 8192) = 8192
> _llseek(55, 857923584, [857923584], SEEK_SET) = 0
> read(55, "\4\0\0\0\224\242\320|\1\0\0\0p\0|\0\0 \2
> \234\236\310\002"..., 8192) = 8192
> _llseek(55, 57270272, [57270272], SEEK_SET) = 0
> read(55, "\4\0\0\0\3204FK\1\0\0\0t\0\340\0\0 \2
> \310\236j\2\214\235"..., 8192) = 8192
> _llseek(55, 870727680, [870727680], SEEK_SET) = 0
> read(55, "\4\0\0\0x>\233}\1\0\0\0p\0@\1\0 \2 \250\236\260\2X\235"...,
> 8192) = 8192
> _llseek(55, 1014734848, [1014734848], SEEK_SET) = 0
> read(55, "\4\0\0\0\34\354\201\206\1\0\0\0p\0p\0\0 \2
> \264\236\230"..., 8192) = 8192
> _llseek(55, 857874432, [857874432], SEEK_SET) = 0
> read(55, "\4\0\0\0\214\331\317|\1\0\0\0l\0\324\1\0 \2
> \224\236\330"..., 8192) = 8192
> _llseek(55, 760872960, [760872960], SEEK_SET) = 0
> read(55, "\4\0\0\0\30\257\321v\1\0\0\0p\0\230\0\0 \2
> \234\236\310"..., 8192) = 8192
> _llseek(55, 891715584, [891715584], SEEK_SET) = 0
> read(55, "\4\0\0\0\370\220\347~\1\0\0\0p\0P\1\0 \2
> \230\236\320\2"..., 8192) = 8192
> _llseek(55, 857858048, [857858048], SEEK_SET) = 0
> read(55, "\4\0\0\0\250\227\317|\1\0\0\0p\0\264\0\0 \2
> \254\236\250"..., 8192) = 8192
> _llseek(55, 666910720, [666910720], SEEK_SET) = 0
> read(55, "\4\0\0\0x\206\3q\1\0\0\0p\0004\1\0 \2
> \254\236\242\2P\235"..., 8192) = 8192
> _llseek(55, 857841664, [857841664], SEEK_SET) = 0
> read(55, "\4\0\0\0dT\317|\1\0\0\0p\0\224\0\0 \2
> \214\236\350\2\30"..., 8192) = 8192
>
>
>
