Re: 8K recordsize bad on ZFS?

From: Jignesh Shah <jkshah(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: 8K recordsize bad on ZFS?
Date: 2010-05-08 21:39:02
Message-ID: AANLkTimjfVo4s6TKu7-1BN5B-Wn4m3SnsUduRmaDxFAV@mail.gmail.com
Lists: pgsql-performance

On Fri, May 7, 2010 at 8:09 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Jignesh, All:
>
> Most of our Solaris users have been, I think, following Jignesh's advice
> from his benchmark tests to set ZFS page size to 8K for the data zpool.
>  However, I've discovered that this is sometimes a serious problem for
> some hardware.
>
> For example, having the recordsize set to 8K on a Sun 4170 with 8 drives
> recently gave me these appalling Bonnie++ results:
>
> Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
> Concurrency   4     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> db111           24G           260044  33 62110  17           89914  15  1167  25
> Latency                        6549ms    4882ms              3395ms     107ms
>
> I know that's hard to read.  What it's saying is:
>
> Seq Writes: 260 MB/s combined
> Seq Reads: 89 MB/s combined
> Read Latency: 3.3s
>
> Best guess is that this is a result of overloading the array/drives with
> commands for all those small blocks; certainly the behavior observed
> (stuttering I/O, latency) is in line with that issue.
>
> Anyway, since this is a DW-like workload, we just bumped the recordsize
> up to 128K and the performance issues went away ... reads up over 300 MB/s.
>
> --
>                                  -- Josh Berkus
>                                     PostgreSQL Experts Inc.
>                                     http://www.pgexperts.com
>

Hi Josh,

The 8K recommendation is for OLTP applications, so if you have seen it
suggested somewhere for DSS/DW workloads, let me know and, if I still
have access to that document, I will correct it. DW workloads need
throughput, and with an 8K recordsize they are limited to 8K x max
IOPS: at a typical ~120 IOPS per SAS drive, 8 drives comes to roughly
8 MB/s. (Prefetching and other read optimizations can push that to
about 24-30 MB/s with 8K records on 12-disk arrays.) So yes, that
advice is typically bad for DSS, and I generally recommend 128K there
instead. For OLTP, however, you generally want more IOPS at low
latency, which is what an 8K recordsize provides (it matches
PostgreSQL's 8K block size).
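
To make that arithmetic concrete, here is a minimal back-of-the-envelope
sketch (the ~120 IOPS per SAS drive and the 8-drive count are the typical
estimates from above, not measured values):

    # throughput ceiling when IOPS-bound: recordsize x IOPS/drive x drives
    echo "8K records:   $(( 8 * 120 * 8 )) KB/s"    # 7680 KB/s  ~= 7.5 MB/s
    echo "128K records: $(( 128 * 120 * 8 )) KB/s"  # 122880 KB/s ~= 120 MB/s

Real numbers land higher once prefetch and ZFS I/O aggregation kick in,
but the 16x scaling with record size is the point.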

Hope this clarifies.
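
For reference, the knob itself is per-dataset (the dataset name below is
hypothetical); note that recordsize only applies to files written after
the change, so existing tables have to be rewritten, e.g. reloaded or
copied, to pick it up:

    zfs get recordsize tank/pgdata          # check the current setting
    zfs set recordsize=128K tank/pgdata     # DSS/DW: favor sequential throughput
    zfs set recordsize=8K tank/pgdata       # OLTP: match PostgreSQL's 8K blocks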

-Jignesh
