Re: Horizontal Write Scaling

From: Eliot Gable <egable+pgsql-hackers(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Horizontal Write Scaling
Date: 2010-11-27 17:04:55
Message-ID: AANLkTimGuETOHVntCxoBDyZk28JRTVTC5T1mz4mQkj54@mail.gmail.com
Lists: pgsql-hackers

Thanks, everyone, for all the feedback! I am nowhere near a database expert
yet, but you guys have been very helpful in clearing up some of my
confusion. I have checked out Postgres-XC, and it looks like the upcoming
version 1.0 will probably cover everything I have been looking for in terms
of Postgres capabilities. The big ones are write scaling, read scaling, a
consistent view of the data across all servers, and HA capabilities. The
last time I looked at Postgres-XC was probably a year ago; it was nowhere
close to what I needed at the time, so I forgot all about it. Now, it looks
like a real contender.

I was aware of Postgres-R and was actually thinking I might be able to get
away with using that, but the project I am working on does a substantial
amount of writing and is also CPU intensive. Each query executes a stored
procedure which is about 2,500 lines long and pulls data from about 80
tables to compute a final result set. That final result set is returned to
the requester and is also written into 3 tables (while still inside the
original transaction). One of those tables gets one row, while the other two
get 6 to 15 rows per query. I execute hundreds of these queries per second.

So, I need to spread the load across multiple boxes because of CPU usage,
but still have a consistent view of the written data. Using conventional
drives, I would saturate the disk I/O pretty quickly on commodity hardware.
With normal multi-master replication, every server has to apply every write,
so the cost of making sure I have enough disk I/O on each server is way more
than I have the budget for. With a write scaling solution, it suddenly looks
affordable.

I was looking at maybe getting a single shared RAID array with some
enterprise-class SSDs that could guarantee writes even during a power
failure. I was hoping to find something that would let multiple Postgres
instances share that disk array, as that would be a more cost-effective way
to get both the CPU power and the disk I/O I need than putting such a RAID
array in each and every server I was going to spread the load across.

Postgres-XC actually makes it look even more affordable, as I now probably
no longer need to consider SSDs, or at least I don't need to consider a RAID
10 array of 4 or more SSDs per box. I can probably do RAID 1 with 2 drives
per box and have plenty of disk I/O available for the amount of CPU power I
would have in the boxes.
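
To put rough numbers on the write volume described above, here is a
back-of-the-envelope sketch. Everything marked hypothetical is an assumption
for illustration and not a measurement: 300 queries per second stands in for
"hundreds", each of the two multi-row tables is assumed to get 6 to 15 rows,
the node count of 4 is made up, and the per-node split assumes writes spread
perfectly evenly under write scaling, which a real cluster will not achieve.

```python
# Back-of-the-envelope row-write volume for the workload described above.
# Hypothetical values are labeled; they are illustrative assumptions only.

QUERIES_PER_SECOND = 300        # hypothetical stand-in for "hundreds"
NODES = 4                       # hypothetical number of boxes
ROWS_SINGLE_TABLE = 1           # one table gets one row per query
MULTI_ROW_TABLES = 2            # the other two tables...
ROWS_PER_MULTI_TABLE = (6, 15)  # ...assumed to get 6 to 15 rows each


def rows_per_query(rows_per_multi_table):
    """Rows written by one query across all three tables."""
    return ROWS_SINGLE_TABLE + MULTI_ROW_TABLES * rows_per_multi_table


def rows_per_second(qps, rows_per_multi_table):
    """Total rows written per second across the whole system."""
    return qps * rows_per_query(rows_per_multi_table)


def per_node_rows_per_second(total_rows, nodes, every_node_applies_all_writes):
    """Rows each node must write per second.

    With full multi-master replication every node applies every write.
    With idealized write scaling the writes are divided evenly.
    """
    if every_node_applies_all_writes:
        return total_rows
    return total_rows / nodes


if __name__ == "__main__":
    low, high = ROWS_PER_MULTI_TABLE
    for label, rows in (("low", low), ("high", high)):
        total = rows_per_second(QUERIES_PER_SECOND, rows)
        replicated = per_node_rows_per_second(total, NODES, True)
        scaled = per_node_rows_per_second(total, NODES, False)
        print(f"{label}: total={total} rows/s, "
              f"per node replicated={replicated}, write-scaled={scaled}")
```

The point of the comparison is only that per-node write load stays flat as
boxes are added under full replication, while it drops roughly with the
number of boxes under write scaling. It says nothing about WAL volume, index
maintenance, or coordination overhead.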

So, thanks again for the feedback.
