Re: Partitioning / Clustering

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Undisclosed(dot)Recipients: ;
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Partitioning / Clustering
Date: 2005-05-10 17:46:24
Message-ID: 200505101046.25108.josh@agliodbs.com
Lists: pgsql-performance

Alex,

> This is why I mention partitioning. It solves this issue by storing  
> different data sets on different machines under the same schema.  

That's clustering, actually. Partitioning is simply dividing up a table into
chunks and using the chunks intelligently. Putting those chunks on separate
machines is another thing entirely.
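
To make the distinction concrete, here is a minimal sketch of what "dividing up
a table into chunks and using the chunks intelligently" can look like on a
single machine. It is only an illustration: the class name
RangePartitionedTable, its methods, and the choice of date ranges as the
partitioning key are all assumptions made for this example, and it is not how
Bizgres or PostgreSQL implements partitioning.

```python
from bisect import bisect_right
from datetime import date


class RangePartitionedTable:
    """Toy single-machine table split into chunks by date boundaries.

    Hypothetical illustration only; not a real PostgreSQL or Bizgres API.
    """

    def __init__(self, boundaries):
        # boundaries[i] is the first date that belongs to chunk i + 1,
        # so N boundaries produce N + 1 chunks.
        self.boundaries = sorted(boundaries)
        self.chunks = [[] for _ in range(len(self.boundaries) + 1)]

    def _chunk_index(self, day):
        # Binary search over the boundaries picks the owning chunk.
        return bisect_right(self.boundaries, day)

    def insert(self, day, payload):
        self.chunks[self._chunk_index(day)].append((day, payload))

    def select_range(self, start, end):
        """Return (rows with start <= day <= end, number of chunks scanned)."""
        if start > end:
            return [], 0
        first = self._chunk_index(start)
        last = self._chunk_index(end)
        rows = []
        # "Using the chunks intelligently": only chunks whose range can
        # overlap the query are scanned; the rest are skipped entirely.
        for chunk in self.chunks[first:last + 1]:
            rows.extend(row for row in chunk if start <= row[0] <= end)
        return rows, last - first + 1
```

With boundaries at 2005-02-01 and 2005-03-01, a query restricted to February
scans one chunk out of three. Everything still lives in one process on one
machine; placing the chunks on different machines is the separate problem
discussed in the rest of this message.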

We're working on partitioning through the Bizgres sub-project:
www.bizgres.org / http://pgfoundry.org/projects/bizgres/
... and will be pushing it to the main PostgreSQL when we have something.

I invite you to join the mailing list.

> These separate chunks of the table can then be replicated as well for  
> data redundancy and so on. MySQL are working on these things,

Don't hold your breath. MySQL, to judge by their first "clustering"
implementation, has a *long* way to go before they have anything usable. In
fact, at OSCON their engineers were asking Jan Wieck for advice.

If you have $$$ to shell out, my employer (GreenPlum) has a multi-machine
distributed version of PostgreSQL. It's proprietary, though.
www.greenplum.com.

If you have more time than money, I understand that Stanford is working on
this problem:
http://www-db.stanford.edu/~bawa/

But, overall, some people on this list are very mistaken in thinking it's an
easy problem. GreenPlum has devoted something like 5 engineers for 3 years to
develop their system. Oracle spent over $100 million to develop RAC.

> but PG  
> just has a bunch of third party extensions, I wonder why these are  
> not being integrated into the main trunk :/

Because it represents a host of complex functionality which is not applicable
to most users? Because there are 4 types of replication and 3 kinds of
clustering and not all users want the same kind?

--
Josh Berkus
Aglio Database Solutions
San Francisco
