Re: Hardware/OS recommendations for large databases (

From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Hardware/OS recommendations for large databases (
Date: 2005-11-24 17:07:40
Message-ID: BFAB335C.14684%llonergan@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

The same 12.9GB distributed across 4 machines using Bizgres MPP fits into
I/O cache. The interesting result is that the query "select count(1)" is
limited in speed to 280 MB/s per CPU when run on the lineitem table. So
when I run it spread over 4 machines, one CPU per machine I get this:

======================================================
Bizgres MPP, 4 data segments, 1 per 2 CPUs
======================================================
llonergan=# explain select count(1) from lineitem;
QUERY PLAN
----------------------------------------------------------------------------
----------
Aggregate (cost=582452.00..582452.00 rows=1 width=0)
-> Gather Motion (cost=582452.00..582452.00 rows=1 width=0)
-> Aggregate (cost=582452.00..582452.00 rows=1 width=0)
-> Seq Scan on lineitem (cost=0.00..544945.00 rows=15002800
width=0)
(4 rows)

llonergan=# \timing
Timing is on.
llonergan=# select count(1) from lineitem;
count
----------
59986052
(1 row)

Time: 12191.435 ms
llonergan=# select count(1) from lineitem;
count
----------
59986052
(1 row)

Time: 11986.109 ms
llonergan=# select count(1) from lineitem;
count
----------
59986052
(1 row)

Time: 11448.941 ms
======================================================

That's 12,937 MB in 11.45 seconds, or 1,130 MB/s. When you divide out the
number of Postgres instances (4), that's 283MB/s per Postgres instance.

To verify that this has nothing to do with MPP, I ran it in a special
internal mode on one instance and got the same result.

So - we should be able to double this rate by running one segment per CPU,
or two per host:

======================================================
Bizgres MPP, 8 data segments, 1 per CPU
======================================================
llonergan=# select count(1) from lineitem;
count
----------
59986052
(1 row)

Time: 6484.594 ms
llonergan=# select count(1) from lineitem;
count
----------
59986052
(1 row)

Time: 6156.729 ms
llonergan=# select count(1) from lineitem;
count
----------
59986052
(1 row)

Time: 6063.416 ms
======================================================
That's 12,937 MB in 11.45 seconds, or 2,134 MB/s. When you divide out the
number of Postgres instances (8), that's 267MB/s per Postgres instance.

So, if you want to "select count(1)", using more CPUs is a good idea! For
most complex queries, having lots of CPUs + MPP is a good combo.

Here is an example of a sorting plan - this should probably be done with a
hash aggregation, but using 8 CPUs makes it go 8x faster:

- Luke

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Bealach-na Bo 2005-11-24 18:14:53 Re: Very slow queries - please help
Previous Message Greg Stark 2005-11-24 16:40:21 Re: Hardware/OS recommendations for large databases (