Re: Hardware/OS recommendations for large databases (

From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Mark Kirkwood" <markir(at)paradise(dot)net(dot)nz>
Cc: stange(at)rentec(dot)com, "Greg Stark" <gsstark(at)mit(dot)edu>, "Dave Cramer" <pg(at)fastcrypt(dot)com>, "Joshua Marsh" <icub3d(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Hardware/OS recommendations for large databases (
Date: 2005-11-24 08:17:06
Message-ID: BFAAB702.1460D%llonergan@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Mark,

See the results below and analysis - the pure HeapScan gets 94.1% of the max
available read bandwidth (cool!). Nothing wrong with heapscan in the
presence of large readahead, which is good news.

That says it's something else in the path. As you probably know there is a
page lock taken, a copy of the tuple from the page, lock removed, count
incremented for every iteration of the agg node on a count(*). Is the same
true of a count(1)?

I recall that the profile is full of memcpy and memory context calls.

It would be nice to put some tracers into the executor and see where the
time is going. I'm also curious about the impact of the new 8.1 virtual
tuples in reducing the executor overhead. In this case my bet's on the agg
node itself, what do you think?

- Luke

On 11/21/05 9:10 PM, "Mark Kirkwood" <markir(at)paradise(dot)net(dot)nz> wrote:

> Luke Lonergan wrote:
>
>> So that leaves the question - why not more than 64% of the I/O scan rate?
>> And why is it a flat 64% as the I/O subsystem increases in speed from
>> 333-400MB/s?
>>
>
> It might be interesting to see what effect reducing the cpu consumption
> entailed by the count aggregation has - by (say) writing a little bit
> of code to heap scan the desired relation (sample attached).

OK - here are results for a slightly smaller (still bigger than RAM)
lineitem on the same machine, using the same xfs filesystem that achieved
407MB/s:

============================================================================
12.9GB of DBT-3 data from the lineitem table
============================================================================
llonergan=# select relpages from pg_class where relname='lineitem';
relpages
----------
1579270
(1 row)

1579270*8192/1000000
12937 Million Bytes or 12.9GB

llonergan=# \timing
Timing is on.
llonergan=# select count(1) from lineitem;
count
----------
59986052
(1 row)

Time: 197870.105 ms
llonergan=# select count(1) from lineitem;
count
----------
59986052
(1 row)

Time: 49912.164 ms
llonergan=# select count(1) from lineitem;
count
----------
59986052
(1 row)

Time: 49218.739 ms

llonergan=# select fastcount('lineitem');
fastcount
-----------
59986052
(1 row)

Time: 33752.778 ms
llonergan=# select fastcount('lineitem');
fastcount
-----------
59986052
(1 row)

Time: 34543.646 ms
llonergan=# select fastcount('lineitem');
fastcount
-----------
59986052
(1 row)

Time: 34528.053 ms

============================================================================
Analysis:
============================================================================
Bandwidth Percent of max
dd Read 407MB/s 100%
Count(1) 263MB/s 64.6%
HeapScan 383MB/s 94.1%

Wow - looks like the HeapScan gets almost all of the available bandwidth!

- Luke

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mark Kirkwood 2005-11-24 08:53:16 Re: Hardware/OS recommendations for large databases (
Previous Message Mark Kirkwood 2005-11-24 05:34:03 Re: Hardware/OS recommendations for large databases (