Re: Using pgiosim realistically

From: "ktm(at)rice(dot)edu" <ktm(at)rice(dot)edu>
To: John Rouillard <rouilj(at)renesys(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Using pgiosim realistically
Date: 2011-05-14 17:07:02
Message-ID: 20110514170702.GC32389@staff-mud-56-27.rice.edu
Lists: pgsql-performance

On Fri, May 13, 2011 at 09:09:41PM +0000, John Rouillard wrote:
> Hi all:
>
> I am adding pgiosim to our testing for new database hardware and I am
> seeing something I don't quite get and I think it's because I am using
> pgiosim incorrectly.
>
> Specs:
>
> OS: centos 5.5 kernel: 2.6.18-194.32.1.el5
> memory: 96GB
> cpu: 2x Intel(R) Xeon(R) X5690 @ 3.47GHz (6 core, ht enabled)
> disks: WD2003FYYS RE4
> raid: lsi - 9260-4i with 8 disks in raid 10 configuration
> 1MB stripe size
> raid cache enabled w/ bbu
> disk caches disabled
> filesystem: ext3 created with -E stride=256
>
> I am seeing really poor (70) iops with pgiosim. According to:
> http://www.tomshardware.com/reviews/2tb-hdd-7200,2430-8.html in the
> database benchmark they are seeing ~170 iops on a single disk for
> these drives. I would expect an 8 disk raid 10 should get better than
> 3x the single disk rate (assuming the data is randomly distributed).
>
> To test I am using 5 100GB files with
>
> sudo ~/pgiosim -c -b 100G -v file?
>
> I am using 100G sizes to make sure that the data read and file sizes
> exceed the memory size of the system.
>
> However if I use 5 1GB files (and still 100GB read data) I see 200+ to
> 400+ iops at 50% of the 100GB of data read, which I assume means that
> the data is cached in the OS cache and I am not really getting hard
> drive/raid I/O measurement of iops.
>
> However, IIUC postgres will never have an index file greater than 1GB
> in size
> (http://www.postgresql.org/docs/8.4/static/storage-file-layout.html)
> and will just add 1GB segments, so the 1GB size files seem to be more
> realistic.
>
> So do I want 100 (or probably 2 or 3 times more say 300) 1GB files to
> feed pgiosim? That way I will have enough data that not all of it can
> be cached in memory and the file sizes (and file operations:
> open/close) more closely match what postgres is doing with index
> files?
>
> Also in the output of pgiosim I see:
>
> 25.17%, 2881 read, 0 written, 2304.56kB/sec 288.07 iops
>
> which I interpret (left to right) as the % of the 100GB that has been
> read, the number of read operations over some time period, number of
> bytes read/written and the io operations/sec. Iops always seems to be
> 1/10th of the read number (rounded up to an integer). Is this
> expected and if so anybody know why?
>
> While this is running if I also run "iostat -p /dev/sdc 5" I see:
>
> Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
> sdc 166.40 2652.80 4.80 13264 24
> sdc1 2818.80 1.20 999.20 6 4996
>
> which I am interpreting as 2818 read/io operations (corresponding more
> or less to read in the pgiosim output) to the partition and of those
> only 166 are actually going to the drive??? with the rest handled from
> OS cache.
>
> However the tps isn't increasing when I see pgiosim reporting:
>
> 48.47%, 4610 read, 0 written, 3687.62kB/sec 460.95 iops
>
> an iostat 5 output near the same time is reporting:
>
> Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
> sdc 165.87 2647.50 4.79 13264 24
> sdc1 2812.97 0.60 995.41 3 4987
>
> so I am not sure if there is a correlation between the read and tps
> settings.
>
> Also I am assuming blks written is filesystem metadata although that
> seems like a lot of data.
>
> If I stop the pgiosim, the iostat drops to 0 write and reads as
> expected.
>
> So does anybody have any comments on how to test with pgiosim and how
> to correlate the iostat and pgiosim outputs?
>
> Thanks for your feedback.
> --
> -- rouilj
>
> John Rouillard System Administrator
> Renesys Corporation 603-244-9084 (cell) 603-643-9300 x 111
>
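
The ratios between the fields in the quoted pgiosim and iostat output can be checked with a short calculation. The following is a minimal sketch, not part of pgiosim: the regular expression is written only against the two status lines quoted above, the names `analyze` and `iostat_kb_per_transfer` are hypothetical, and the 512-byte block size for iostat is an assumption based on how iostat reports blocks on 2.4 and later Linux kernels. The roughly 10 second reporting interval is inferred from the numbers, not confirmed from the pgiosim source.

```python
import re

# Status lines copied from the quoted message.
LINES = [
    "25.17%, 2881 read, 0 written, 2304.56kB/sec 288.07 iops",
    "48.47%, 4610 read, 0 written, 3687.62kB/sec 460.95 iops",
]

# Pattern written against the two lines above only (an assumption about
# the general output format).
PATTERN = re.compile(
    r"(?P<pct>[\d.]+)%, (?P<read>\d+) read, (?P<written>\d+) written, "
    r"(?P<kbps>[\d.]+)kB/sec (?P<iops>[\d.]+) iops"
)


def analyze(line):
    """Return kB per read and the implied reporting interval in seconds."""
    m = PATTERN.fullmatch(line)
    if m is None:
        raise ValueError("unrecognized status line: " + line)
    read = int(m["read"])
    kbps = float(m["kbps"])
    iops = float(m["iops"])
    return {"kb_per_io": kbps / iops, "interval_s": read / iops}


def iostat_kb_per_transfer(tps, blk_read_s, blk_wrtn_s, block_bytes=512):
    """Average kB moved per device transfer, assuming 512-byte blocks."""
    return (blk_read_s + blk_wrtn_s) * block_bytes / 1024 / tps


for line in LINES:
    result = analyze(line)
    print("%.2f kB per read, %.2f s interval"
          % (result["kb_per_io"], result["interval_s"]))

# First sdc row from the quoted iostat output.
print("%.2f kB per transfer on sdc"
      % iostat_kb_per_transfer(166.40, 2652.80, 4.80))
```

Under these assumptions both status lines work out to about 8 kB per read, and the "read" count is about ten times the iops figure because it covers an interval of about ten seconds. The sdc row also works out to roughly 8 kB per transfer, so the whole-device tps (about 166) is the figure that is comparable to per-second 8 kB reads reaching the array.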

Hi John,

Those are 7200 rpm drives, which works out to 120 revolutions per
second, so with the cache disabled you would get a maximum write rate of
120/sec at best. I actually think your 70/sec is closer to reality and
what you should anticipate in real use. I do not see how they could make
170/sec. Did they strap a jet engine to the drive? :)
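
The arithmetic behind the 120/sec figure can be written out as a small sketch. It assumes the simple model in which an uncached synchronous write has to wait for the platter to come around, so a single drive completes at most one such write per revolution; seek time is ignored, and the function names are hypothetical.

```python
def revolutions_per_second(rpm):
    """Platter revolutions per second for a given spindle speed."""
    return rpm / 60.0


def full_rotation_ms(rpm):
    """Time for one complete revolution, in milliseconds."""
    return 60000.0 / rpm


def avg_rotational_latency_ms(rpm):
    """Average wait for a random sector: half a revolution."""
    return full_rotation_ms(rpm) / 2.0


rpm = 7200
print("%.0f revolutions/sec" % revolutions_per_second(rpm))
print("%.2f ms per revolution" % full_rotation_ms(rpm))
print("%.2f ms average rotational latency" % avg_rotational_latency_ms(rpm))
```

For 7200 rpm this gives 120 revolutions per second, which is the ceiling quoted above under the one-write-per-revolution assumption. Real random I/O also pays seek time, so measured rates would be expected to fall below that.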

Regards,
Ken
