Re: amazon ec2

From: Jim Nasby <jim(at)nasby(dot)net>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Mark Rostron <mrostron(at)ql2(dot)com>, Alan Hodgson <ahodgson(at)simkin(dot)ca>, pgsql-performance(at)postgresql(dot)org
Subject: Re: amazon ec2
Date: 2011-05-04 13:05:35
Message-ID: 00C0C53C-7042-4C05-B6CD-68BFDF400C76@nasby.net
Lists: pgsql-performance

On May 3, 2011, at 5:39 PM, Greg Smith wrote:
> I've also seen over a 20:1 speedup over PostgreSQL by using Greenplum's free Community Edition server, in situations where its column store + compression features work well on the data set. That's easiest with an append-only workload, and the data set needs to fit within the constraints where indexes on compressed data are useful. But if you fit the use profile it's good at, you end up with considerable ability to trade off more CPU resources for faster queries. It effectively increases the amount of data that can be cached in RAM by a large multiple, and in the EC2 context (where any access to disk is very slow) it can be quite valuable.

FWIW, EnterpriseDB's "InfiniCache" provides the same caching benefit. The way it works is that when PG goes to evict a page from shared buffers, that page gets compressed and stuffed into a memcached cluster. When PG determines that a given page isn't in shared buffers, it checks that memcached cluster before reading the page from disk. This allows you to cache amounts of data that far exceed the amount of memory you could put in a single physical server.
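
To make the lookup order concrete, here is a rough Python sketch of that kind of two-tier cache. It is only an illustration of the idea described above, not InfiniCache's actual code: the TwoTierCache class, the fake_disk function, the plain dict standing in for the memcached cluster, and the LRU eviction policy are all assumptions made for the example (PostgreSQL's real buffer replacement is not a simple LRU), and it ignores dirty pages and invalidation entirely.

```python
import zlib
from collections import OrderedDict

PAGE_SIZE = 8192  # PostgreSQL's default block size


class TwoTierCache:
    """Toy model: small local buffer pool backed by a compressed second tier."""

    def __init__(self, capacity, read_from_disk):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.buffers = OrderedDict()  # page_id -> raw page, oldest first (LRU, an assumption)
        self.remote = {}              # hypothetical stand-in for the memcached cluster
        self.read_from_disk = read_from_disk
        self.disk_reads = 0

    def _evict_if_full(self):
        # On eviction, compress the victim page and push it to the second tier.
        while len(self.buffers) >= self.capacity:
            victim_id, victim = self.buffers.popitem(last=False)
            self.remote[victim_id] = zlib.compress(victim)

    def get(self, page_id):
        # 1. Local buffer hit.
        if page_id in self.buffers:
            self.buffers.move_to_end(page_id)
            return self.buffers[page_id]
        # 2. Check the second tier before touching disk.
        blob = self.remote.get(page_id)
        if blob is not None:
            page = zlib.decompress(blob)
        else:
            # 3. Only now pay for a disk read.
            page = self.read_from_disk(page_id)
            self.disk_reads += 1
        self._evict_if_full()
        self.buffers[page_id] = page
        return page


def fake_disk(page_id):
    # Hypothetical "disk": a highly compressible page filled with one byte value.
    return bytes([page_id % 256]) * PAGE_SIZE


cache = TwoTierCache(capacity=2, read_from_disk=fake_disk)
for pid in (1, 2, 3, 1):
    cache.get(pid)
```

In this run the second request for page 1 is served from the compressed tier, so only three of the four requests reach the "disk". How much extra data the second tier holds in practice depends on how well the pages compress.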
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net
