Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Bill Moran <wmoran(at)potentialtech(dot)com>
Cc: Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)
Date: 2007-08-28 16:09:52
Message-ID: 46D448D0.4000301@bluegap.ch
Lists: pgsql-general

Hi,

Bill Moran wrote:
> While true, I feel those applications are the exception, not the rule.
> Most DBs these days are the blogs and the image galleries, etc. And
> those don't need or want the overhead associated with synchronous
> replication.

Uhm.. do blogs and image galleries need replication at all?

I'm thinking more of the business critical applications, where high
availability is a real demand - and where your data really *should* be
distributed among multiple data centers just to avoid a single point of
failure.

<rant> for most other stuff MySQL is good enough </rant>

> I find that line fuzzy.

Yeah, it is.

> It's synchronous for the reason you describe,
> but it's asynchronous because a query that has returned successfully
> is not _guaranteed_ to be committed everywhere yet. Seems like we're
> dealing with a limitation in the terminology :)

Certainly! But sync and async replication are such well known and widely
used terms... On the other hand, I certainly agree that in Postgres-R, the
nodes do not process transactions synchronously, but asynchronously.

Maybe it's really better to speak of eager and lazy replication, as some
of the literature does (namely the initial Postgres-R paper by Bettina Kemme).

> This could potentially be a problem on (for example) a web application,
> where a particular user's experience may be load-balanced to another
> node at any time. Of course, you just have to write the application
> with that knowledge.

IMO, such heavily dynamic load-balancing is rarely useful.

With application support, it's easily doable: let the first transaction
on node A query its (global) transaction identifier and, after connecting
to the next node B, ask B to wait until that transaction has committed
there.
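
To make the idea concrete, here is a minimal sketch of the "wait until
that transaction has committed" step as a toy in-memory simulation. The
Node class, apply_commit() and wait_for() are hypothetical names made up
for illustration; this is not an actual Postgres-R interface, and the
transaction id 42 is just a placeholder.

```python
import threading


class Node:
    """Toy stand-in for a replica (hypothetical, not a Postgres-R API)."""

    def __init__(self, name):
        self.name = name
        self._committed = set()            # global transaction ids committed here
        self._cond = threading.Condition()

    def apply_commit(self, gtid):
        # Called when a transaction commits locally or arrives via replication.
        with self._cond:
            self._committed.add(gtid)
            self._cond.notify_all()

    def wait_for(self, gtid, timeout=None):
        # Block until gtid has committed on this node; returns False on timeout.
        with self._cond:
            return self._cond.wait_for(lambda: gtid in self._committed, timeout)


node_a, node_b = Node("A"), Node("B")

gtid = 42                                  # id the application queried on node A
node_a.apply_commit(gtid)

# Simulate replication delivering the commit to node B slightly later.
threading.Timer(0.05, node_b.apply_commit, args=(gtid,)).start()

# After switching to node B, the application waits before reading.
caught_up = node_b.wait_for(gtid, timeout=2.0)
```

The application only pays the waiting cost when it actually switches
nodes, and only until the one transaction it cares about has arrived.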

It gets a little harder without application support: the load balancer
would have to keep track of sessions and their last (writing) transaction.
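
The bookkeeping such a load balancer would need could look roughly like
the sketch below. SessionTracker and pick_node() are hypothetical names,
and the sets of committed transaction ids per node are assumed to be
available to the balancer somehow; routing to a node that already has
the session's last write is only one possible use of that information
(the balancer could just as well ask the chosen node to wait).

```python
class SessionTracker:
    """Hypothetical per-session bookkeeping inside a load balancer."""

    def __init__(self):
        self._last_write = {}              # session id -> last writing gtid

    def record_write(self, session_id, gtid):
        # Remember the most recent writing transaction of this session.
        self._last_write[session_id] = gtid

    def required_gtid(self, session_id):
        # None means the session has not written anything yet.
        return self._last_write.get(session_id)


def pick_node(tracker, session_id, committed_by_node):
    """Return the first node that already has the session's last write.

    committed_by_node maps node name -> set of committed gtids (assumed
    to be known to the balancer). Returns None if no node qualifies.
    """
    need = tracker.required_gtid(session_id)
    for name, committed in committed_by_node.items():
        if need is None or need in committed:
            return name
    return None


tracker = SessionTracker()
tracker.record_write("s1", 7)
state = {"B": {1, 2}, "A": {1, 2, 7}}
choice = pick_node(tracker, "s1", state)   # node B has not seen gtid 7 yet
```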

Again, thank you for pointing this out.

Regards

Markus
