Re: Using pgiosim realistically

From: "ktm(at)rice(dot)edu" <ktm(at)rice(dot)edu>
To: John Rouillard <rouilj(at)renesys(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Using pgiosim realistically
Date: 2011-05-14 17:07:02
Message-ID: 20110514170702.GC32389@staff-mud-56-27.rice.edu
Lists: pgsql-performance

On Fri, May 13, 2011 at 09:09:41PM +0000, John Rouillard wrote:
> Hi all:
> 
> I am adding pgiosim to our testing for new database hardware and I am
> seeing something I don't quite get and I think it's because I am using
> pgiosim incorrectly.
> 
> Specs:
> 
>   OS: centos 5.5 kernel: 2.6.18-194.32.1.el5
>   memory: 96GB
>   cpu: 2x Intel(R) Xeon(R) X5690  @ 3.47GHz (6 core, ht enabled)
>   disks: WD2003FYYS RE4
>   raid: lsi - 9260-4i with 8 disks in raid 10 configuration
>               1MB stripe size
>               raid cache enabled w/ bbu
>               disk caches disabled
>   filesystem: ext3 created with -E stride=256
> 
> I am seeing really poor (70) iops with pgiosim.  According to:
> http://www.tomshardware.com/reviews/2tb-hdd-7200,2430-8.html in the
> database benchmark they are seeing ~170 iops on a single disk for
> these drives. I would expect an 8 disk raid 10 should get better than
> 3x the single disk rate (assuming the data is randomly distributed).
> 
> To test I am using 5 100GB files with
> 
>     sudo ~/pgiosim -c -b 100G -v file?
> 
> I am using 100G sizes to make sure that the data read and file sizes
> exceed the memory size of the system.
> 
> However if I use 5 1GB files (and still 100GB read data) I see 200+ to
> 400+ iops at 50% of the 100GB of data read, which I assume means that
> the data is cached in the OS cache and I am not really getting hard
> drive/raid I/O measurement of iops.
> 
> However, IIUC postgres will never have an index file greater than 1GB
> in size
> (http://www.postgresql.org/docs/8.4/static/storage-file-layout.html)
> and will just add 1GB segments, so the 1GB size files seem to be more
> realistic.
> 
> So do I want 100 (or probably 2 or 3 times more say 300) 1GB files to
> feed pgiosim? That way I will have enough data that not all of it can
> be cached in memory and the file sizes (and file operations:
> open/close) more closely match what postgres is doing with index
> files?
> 
> Also in the output of pgiosim I see:
> 
>   25.17%,   2881 read,      0 written, 2304.56kB/sec  288.07 iops
> 
> which I interpret (left to right) as the % of the 100GB that has been
> read, the number of read operations over some time period, number of
> bytes read/written and the io operations/sec. Iops always seems to be
> 1/10th of the read number (rounded up to an integer). Is this
> expected and if so anybody know why?
> 
> While this is running if I also run "iostat -p /dev/sdc 5" I see:
> 
>   Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
>   sdc             166.40      2652.80         4.80      13264         24
>   sdc1           2818.80         1.20       999.20          6       4996
> 
> which I am interpreting as 2818 read/io operations (corresponding more
> or less to read in the pgiosim output) to the partition and of those
> only 166 are actually going to the drive??? with the rest handled from
> OS cache.
> 
> However the tps isn't increasing when I see pgiosim reporting:
> 
>    48.47%,   4610 read,      0 written, 3687.62kB/sec  460.95 iops
> 
> an iostat 5 output near the same time is reporting:
> 
>   Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
>   sdc             165.87      2647.50         4.79      13264         24
>   sdc1           2812.97         0.60       995.41          3       4987
> 
> so I am not sure if there is a correlation between the read and tps
> settings.
> 
> Also I am assuming blks written is filesystem metadata although that
> seems like a lot of data.
> 
> If I stop the pgiosim, the iostat drops to 0 write and reads as
> expected.
> 
> So does anybody have any comments on how to test with pgiosim and how
> to correlate the iostat and pgiosim outputs?
> 
> Thanks for your feedback.
> -- 
> 				-- rouilj
> 
> John Rouillard       System Administrator
> Renesys Corporation  603-244-9084 (cell)  603-643-9300 x 111
> 

Hi John,

Those are 7200 rpm drives, which spin at 120 revolutions/sec, so with the
cache disabled you would get a maximum write rate of 120/sec at best. I
actually think your 70/sec is closer to reality and to what you should
anticipate in real use. I do not see how they could make 170/sec. Did they
strap a jet engine to the drive? :)
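
The arithmetic behind that estimate can be sketched roughly as follows. This
is only a back-of-the-envelope model, not a measurement: the 8.5 ms average
seek time is an assumed, illustrative figure (it is not taken from the drive's
data sheet), and the function names are made up for this example.

```python
# Rough single-disk estimate: one random I/O costs an average seek plus,
# on average, half a platter revolution. No caching, no command queueing.

def rotations_per_second(rpm: float) -> float:
    """Platter revolutions per second for a given spindle speed."""
    return rpm / 60.0


def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the target sector: half a revolution, in ms."""
    return 0.5 * 1000.0 / rotations_per_second(rpm)


def random_iops(rpm: float, avg_seek_ms: float) -> float:
    """Estimated random I/Os per second for one uncached disk."""
    return 1000.0 / (avg_seek_ms + avg_rotational_latency_ms(rpm))


if __name__ == "__main__":
    rpm = 7200
    assumed_seek_ms = 8.5  # ASSUMPTION: illustrative value only
    print(rotations_per_second(rpm))                 # 120.0
    print(round(avg_rotational_latency_ms(rpm), 2))  # 4.17
    print(round(random_iops(rpm, assumed_seek_ms), 1))
```

With those assumed numbers the estimate comes out somewhat under 80 random
I/Os per second for one disk, which is much nearer to 70 than to 170.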

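On correlating the pgiosim and iostat numbers, a small sketch like the one
below may help as a sanity check. It only divides the figures quoted above by
each other; it is not based on the pgiosim source. The regular expression is
written against the two sample lines in this thread, the helper names are
hypothetical, and the 512-byte block size for iostat is an assumption that
should be checked against the iostat man page on your system.

```python
import re

# Pattern written against the sample pgiosim lines quoted in this thread.
PGIOSIM_LINE = re.compile(
    r"\s*(?P<pct>[\d.]+)%,\s*(?P<read>\d+) read,\s*(?P<written>\d+) written,"
    r"\s*(?P<kbps>[\d.]+)kB/sec\s+(?P<iops>[\d.]+) iops"
)


def parse_pgiosim(line: str) -> dict:
    """Turn one pgiosim progress line into a dict of floats."""
    match = PGIOSIM_LINE.match(line)
    if match is None:
        raise ValueError("unrecognized pgiosim line: %r" % line)
    return {key: float(value) for key, value in match.groupdict().items()}


def derived(stats: dict) -> dict:
    """Ratios implied by the reported figures."""
    return {
        # kB transferred per operation
        "kb_per_op": stats["kbps"] / stats["iops"],
        # seconds covered by the 'read' counter, if iops = read / interval
        "interval_s": stats["read"] / stats["iops"],
    }


def iostat_kb_per_transfer(tps: float, blk_read_per_s: float,
                           block_bytes: int = 512) -> float:
    """Average kB per device transfer; block_bytes=512 is an ASSUMPTION."""
    return blk_read_per_s * block_bytes / 1024.0 / tps


if __name__ == "__main__":
    line = "  25.17%,   2881 read,      0 written, 2304.56kB/sec  288.07 iops"
    print(derived(parse_pgiosim(line)))
    print(round(iostat_kb_per_transfer(166.40, 2652.80), 2))
```

For the sample line this gives about 8 kB per operation and an interval of
about 10 seconds, which would explain why iops looks like 1/10th of the read
count. The sdc line from iostat also works out to roughly 8 kB per transfer
under the 512-byte assumption.
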
Regards,
Ken
