Re: PostgreSQL clustering VS MySQL clustering

From: Christopher Browne <cbbrowne(at)acm(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: PostgreSQL clustering VS MySQL clustering
Date: 2005-01-23 05:58:28
Message-ID: m3sm4szny3.fsf@knuth.knuth.cbbrowne.com
Lists: pgsql-performance

After a long battle with technology, herve(at)elma(dot)fr (Hervé Piedvache), an earthling, wrote:
> Joshua,
>
> On Thursday 20 January 2005 15:44, Joshua D. Drake wrote:
>> Hervé Piedvache wrote:
>> >
>> >My company, which I currently represent, is a fervent user of PostgreSQL.
>> >We have built all our applications on PostgreSQL for more than 5
>> > years. We usually do classical client/server applications under Linux,
>> > and Web interfaces (php, perl, C/C++). We have also managed public web
>> > services with 10 to 15 million records and up to 8 million page views
>> > per month.
>>
>> Depending on your needs either:
>>
>> Slony: www.slony.info
>>
>> or
>>
>> Replicator: www.commandprompt.com
>>
>> Both will do what you want. Replicator is easier to set up, but
>> Slony is free.
>
> No ... as I have said ... how will I manage a database with a table
> of maybe 250 000 000 records? I'll need incredible servers ... to
> get quick access or index reading ... no?
>
> So what we would like to get is a pool of small servers able to act
> as one virtual server ... that is what is called a cluster ... no?

The term "cluster" simply indicates the use of multiple servers.

There are numerous _DIFFERENT_ forms of "clusters," so someone who
says "I want a cluster" without specifying which kind commonly does
not yet know what they want in any usefully identifiable way.

> I know they are not using PostgreSQL ... but how does a company like
> Google manage to get a database of such incredible size with such
> quick access?

Google has built a specialized application that evidently falls into
the category known as "embarrassingly parallel."
<http://c2.com/cgi/wiki?EmbarrassinglyParallel>

There are classes of applications that are amenable to
parallelization.

Those tend to be applications completely different from those
implemented atop transactional data stores like PostgreSQL.
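
To make "embarrassingly parallel" concrete, here is a minimal sketch
in Python. Everything in it is an illustrative assumption on my part
(the shard contents, the names count_matches and parallel_count, and
the use of threads as a stand-in for separate machines); it is not a
description of how Google actually does it. The point is only that
each shard is processed without looking at any other shard, and the
partial results are combined with a trivial sum.

```python
from concurrent.futures import ThreadPoolExecutor


def count_matches(shard, term):
    """Count documents in one shard containing `term`; touches no other shard."""
    return sum(1 for doc in shard if term in doc.split())


def parallel_count(shards, term):
    """Process every shard independently, then add up the partial counts."""
    # Threads stand in for separate servers (an assumption for illustration).
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(lambda s: count_matches(s, term), shards))
    return sum(partials)


# Hypothetical data: three shards that could each live on a different box.
SHARDS = [
    ["postgresql is a database", "slony replicates postgresql"],
    ["mysql is a database"],
    ["a cluster uses multiple servers", "postgresql cluster"],
]

if __name__ == "__main__":
    print(parallel_count(SHARDS, "postgresql"))
```

Note what is missing: no shard ever needs a lock, a join, or a
consistent view of another shard. A transactional workload needs
exactly those things.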

If your problem is "embarrassingly parallel," then I'd bet lunch that
PostgreSQL (like every other SQL database) is exactly the _wrong_ tool
for implementing its data store.

If your problem is _not_ "embarrassingly parallel," then you'll almost
certainly discover that the cheapest way to make it fast involves
fitting all the data onto _one_ computer so that you do not have to
pay the costs of transmitting data over slow inter-computer
communications links.
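
A back-of-envelope sketch of that cost, again in Python. The numbers
are made up for illustration and are not measurements: I am assuming
10 microseconds for a lookup against data already on the local box and
500 microseconds for each round trip to another node. The function
name total_micros is likewise hypothetical.

```python
def total_micros(lookups, local_us, round_trip_us=0):
    """Total time in microseconds when every lookup pays an optional network hop."""
    return lookups * (local_us + round_trip_us)


LOOKUPS = 1_000_000    # assumed number of dependent lookups in one job
LOCAL_US = 10          # assumed cost of one local lookup
ROUND_TRIP_US = 500    # assumed cost of one inter-computer round trip

# All data on one computer: no network hop per lookup.
one_box = total_micros(LOOKUPS, LOCAL_US)

# Data spread over several computers: each lookup also crosses the network.
spread_out = total_micros(LOOKUPS, LOCAL_US, ROUND_TRIP_US)

if __name__ == "__main__":
    print(one_box / 1_000_000, "seconds on one box")
    print(spread_out / 1_000_000, "seconds across the network")
```

Under those assumed figures the single box finishes in 10 seconds and
the spread-out version takes 510. The exact ratio depends entirely on
the numbers you plug in; the shape of the result is what matters when
the lookups depend on one another and cannot simply be farmed out.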
--
let name="cbbrowne" and tld="gmail.com" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/
It isn't that physicists enjoy physics more than they enjoy sex, it's
that they enjoy sex more when they are thinking of physics.
