Re: Partitioning / Clustering

From: Alex Stapleton <alexs(at)advfn(dot)com>
To: John A Meinel <john(at)arbash-meinel(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Partitioning / Clustering
Date: 2005-05-10 15:10:30
Message-ID: DCFAEF09-885E-492C-A885-EEE405E7773F@advfn.com
Lists: pgsql-performance


On 10 May 2005, at 15:41, John A Meinel wrote:

> Alex Stapleton wrote:
>
>> What is the status of Postgres support for any sort of multi-machine
>> scaling? What are you meant to do once you've upgraded your box and
>> tuned the conf files as much as you can, but your query load is just
>> too high for a single machine?
>>
>> Upgrading stock Dell boxes (I know we could be using better machines,
>> but I am trying to tackle the real issue) is not a hugely price
>> efficient way of getting extra performance, nor particularly scalable
>> in the long term.
>>
>
> Switch from Dell Xeon boxes, and go to Opterons. :) Seriously, Dell is
> far away from Big Iron. I don't know what performance you are looking
> for, but you can easily get into inserting 10M rows/day with quality
> hardware.

Better hardware = More Efficient != More Scalable

> But actually is it your SELECT load that is too high, or your INSERT
> load, or something in between?
>
> Because Slony is around if it is a SELECT problem.
> http://gborg.postgresql.org/project/slony1/projdisplay.php
>
> Basically, Slony is a Master/Slave replication system. So if you have
> INSERTs going into the Master, you can have as many replicated slaves
> as you need to handle your SELECT load.
> Slony is an asynchronous replicator, so there is a time delay from the
> INSERT until it will show up on a slave, but that time could be pretty
> small.

<snip>

>
>>
>> So, when/is PG meant to be getting a decent partitioning system? MySQL
>> is getting one (eventually) which is apparently meant to be similar to
>> Oracle's according to the docs. Clusgres does not appear to be widely
>> used, if at all, and info on it seems pretty thin on the ground, so I
>> am not too keen on going with that. Is the real solution to
>> multi-machine partitioning (as in, not like MySQL's MERGE tables) on
>> PostgreSQL actually doing it in our application API? This seems like a
>> less than perfect solution once we want to add redundancy and things
>> into the mix.
>>
>
> There is also PGCluster
> http://pgfoundry.org/projects/pgcluster/
>
> Which is trying to be more of a Synchronous multi-master system. I
> haven't heard of Clusgres, so I'm guessing it is an older attempt,
> which has been overtaken by pgcluster.
>
> Just realize that clusters don't necessarily scale like you would want
> them to. Because at some point you have to insert into the same table,
> which means you need to hold a lock which prevents the other machine
> from doing anything. And with synchronous replication, you have to wait
> for all of the machines to get a copy of the data before you can say it
> has been committed, which does *not* scale well with the number of
> machines.

This is why I mention partitioning. It solves this issue by storing
different data sets on different machines under the same schema.
These separate chunks of the table can then be replicated as well for
data redundancy and so on. MySQL is working on these things, but PG
just has a bunch of third-party extensions. I wonder why these are
not being integrated into the main trunk :/ Thanks for pointing me to
PGCluster though. It looks like it should be better than Slony at least.
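
For illustration, here is a rough sketch of what doing the partitioning in
the application API might look like: hash the partition key, pick a machine,
and send writes to that partition's primary and reads to its replica. The
host names, the SHARDS list, and the shard_index/host_for helpers are all
made up for this example and are not part of PostgreSQL, Slony or PGCluster.

```python
import hashlib

# Hypothetical layout: every machine holds the same schema but a different
# slice of the rows; each slice also has one replica for redundancy.
SHARDS = [
    {"primary": "pg-a.example.internal", "replica": "pg-a2.example.internal"},
    {"primary": "pg-b.example.internal", "replica": "pg-b2.example.internal"},
    {"primary": "pg-c.example.internal", "replica": "pg-c2.example.internal"},
]


def shard_index(key, shard_count):
    """Map a partition key to a shard number in a stable way."""
    # hashlib is used instead of the built-in hash() so that the mapping
    # is identical across processes and restarts.
    digest = hashlib.sha1(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def host_for(key, write=True, shards=SHARDS):
    """Return the host that should serve a query for this key."""
    shard = shards[shard_index(key, len(shards))]
    # Writes go to the primary of the partition; reads may use its replica.
    return shard["primary"] if write else shard["replica"]


if __name__ == "__main__":
    print(host_for("customer-42", write=True))
    print(host_for("customer-42", write=False))
```

Note that with plain modulo hashing, changing the number of shards remaps
most keys, and any query that spans several partitions has to be merged by
the application itself, which is part of why this feels like a less than
perfect solution.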
