Re: Hardware/OS recommendations for large databases (

From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Dave Cramer" <pg(at)fastcrypt(dot)com>
Cc: "Greg Stark" <gsstark(at)mit(dot)edu>, "Joshua Marsh" <icub3d(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Hardware/OS recommendations for large databases (
Date: 2005-11-18 15:13:42
Message-ID: BFA32FA6.14027%llonergan@greenplum.com
Lists: pgsql-performance

Dave,

On 11/18/05 5:00 AM, "Dave Cramer" <pg(at)fastcrypt(dot)com> wrote:
>
> Now there's an interesting line drawn in the sand. I presume you have
> numbers to back this up ?
>
> This should draw some interesting posts.

Part 2: The answer

System A:
> This system is running RedHat 3 Update 4, with a Fedora 2.6.10 Linux kernel.
>
> On a single table with 15 columns (the Bizgres IVP) at a size double memory
> (2.12GB), Postgres 8.0.3 with Bizgres enhancements takes 32 seconds to scan
> the table: that's 66 MB/s. That's not the efficiency I'd hoped for from the
> onboard SATA controller; I would have expected to get 85% of the 100MB/s raw
> read performance.
>
> So that's $1,200 / 66 MB/s (without adjusting for 2003 price versus now) =
> 18.2 $/MB/s
>
> Raw data:
> [llonergan(at)kite4 IVP]$ cat scan.sh
> #!/bin/bash
>
> time psql -c "select count(*) from ivp.bigtable1" dgtestdb
> [llonergan(at)kite4 IVP]$ cat sysout1
> count
> ----------
> 10000000
> (1 row)
>
>
> real 0m32.565s
> user 0m0.002s
> sys 0m0.003s
>
> Size of the table data:
> [llonergan(at)kite4 IVP]$ du -sk dgtestdb/base
> 2121648 dgtestdb/base
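The throughput and cost arithmetic above can be sanity-checked with a short script (a sketch using only the figures quoted in this post; it lands close to the 66 MB/s and 18.2 $/MB/s cited):

```python
# Sanity-check the System A numbers quoted above.
table_kb = 2_121_648    # from `du -sk dgtestdb/base`
elapsed_s = 32.565      # "real" time of the count(*) scan
system_cost = 1200      # approximate 2003 hardware cost quoted above ($)

mb_per_s = table_kb * 1024 / 1e6 / elapsed_s  # decimal MB/s
cost_per_mbs = system_cost / mb_per_s

print(f"{mb_per_s:.1f} MB/s, {cost_per_mbs:.1f} $/MB/s")
```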
>
System B:
> This system is running an XFS filesystem, and has been tuned to use very large
> (16MB) readahead. It's running the CentOS 4.1 distro, which uses a Linux
> 2.6.9 kernel.
>
> Same test as above, but with 17GB of data takes 69.7 seconds to scan (!)
> That's 244.2MB/s, which is obviously double my earlier point of 110-120MB/s.
> This system is running with a 16MB Linux readahead setting, let's try it with
> the default (I think) setting of 256KB -- AHA! Now we get 171.4 seconds or
> 99.3MB/s.
>
> So, using the tuned setting of "blockdev --setra 16384" we get $6,000 / 244MB/s
> = 24.6 $/MB/s
> If we use the default Linux setting it's 2.5x worse.
>
> Raw data:
> [llonergan(at)modena2 IVP]$ cat scan.sh
> #!/bin/bash
>
> time psql -c "select count(*) from ivp.bigtable1" dgtestdb
> [llonergan(at)modena2 IVP]$ cat sysout3
> count
> ----------
> 80000000
> (1 row)
>
>
> real 1m9.875s
> user 0m0.000s
> sys 0m0.004s
> [llonergan(at)modena2 IVP]$ !du
> du -sk dgtestdb/base
> 17021260 dgtestdb/base
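The readahead effect can be quantified from the two System B runs above (a sketch using the quoted sizes and timings; the results come out close to the 244.2 and 99.3 MB/s figures cited):

```python
# System B: same 17GB scan with two readahead settings (numbers quoted above).
table_kb = 17_021_260   # from `du -sk dgtestdb/base`
t_tuned = 69.875        # seconds with the 16MB readahead setting
t_default = 171.4       # seconds with the default 256KB readahead

tuned_mbs = table_kb * 1024 / 1e6 / t_tuned      # decimal MB/s
default_mbs = table_kb * 1024 / 1e6 / t_default
speedup = tuned_mbs / default_mbs                # the "2.5x worse" factor

print(f"tuned: {tuned_mbs:.1f} MB/s, default: {default_mbs:.1f} MB/s, "
      f"speedup: {speedup:.2f}x")
```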

Summary:

<cough, cough> OK -- you can get more I/O bandwidth out of the current I/O
path for sequential scan if you tune the filesystem for large readahead.
This is a cheap alternative to overhauling the executor to use asynch I/O.
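For anyone wanting to reproduce the tuning step, readahead is set per block device with blockdev; a minimal sketch, assuming the table data lives on /dev/sda (the device name is a placeholder -- use whatever device backs your data directory):

```shell
# Show the current readahead setting (blockdev counts in 512-byte sectors).
blockdev --getra /dev/sda

# Raise readahead for large sequential scans; requires root.
blockdev --setra 16384 /dev/sda

# Note: this does not survive a reboot -- re-apply it from a boot script.
```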

Still, there is a CPU limit here -- this is not I/O bound, it is CPU limited
as evidenced by the sensitivity to readahead settings. If the filesystem
could do 1GB/s, you wouldn't go any faster than 244MB/s.

- Luke
