idea for a geographically distributed database: how best to implement?

From: Andy Ballingall <andy(at)areyoulocal(dot)co(dot)uk>
To: <pgsql-sql(at)postgresql(dot)org>
Subject: idea for a geographically distributed database: how best to implement?
Date: 2005-11-17 08:44:24
Message-ID: ECOWS04Mp8nkfZyufzT00010fe4@smtp-out4.blueyonder.co.uk
Lists: pgsql-sql

Hello,

I've got a database for a website which is a variant of the 'show stuff near
to me' sort of thing.

Rather than host this database on a single server, I have a scheme in mind
to break the database up geographically so that each one can run comfortably
on a small server, but I'm not sure about the best way of implementing it.

Here's the scheme:

--------------------------------
Imagine that the country is split into an array of square cells.
Each cell contains a database that stores information about people who live
in the area covered by the cell.

There's one problem with this scheme. What happens if you live near the edge
of a cell?

My solution is that any inserted data which lies near to the edge of cell A
is *also* inserted in the database of the relevant neighbouring cell - let's
say cell B.

Thus, if someone lives in cell B, but close to the border with cell A,
they'll see the data that is geographically close to them, even if it
lies in cell A.

--------------------------------
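To make the scheme concrete, here is a minimal sketch of how a point could be mapped to the set of cell databases that should receive it. The cell size, the "near the edge" margin, the flat x/y coordinates in kilometres and the function names are all assumptions made up for illustration; none of them come from an existing system.

```python
import math

CELL_SIZE_KM = 50.0   # assumed side length of a square cell
EDGE_MARGIN_KM = 5.0  # assumed distance that counts as "near the edge"


def home_cell(x_km, y_km):
    """Return the (column, row) of the cell that contains the point."""
    return (math.floor(x_km / CELL_SIZE_KM), math.floor(y_km / CELL_SIZE_KM))


def target_cells(x_km, y_km):
    """Return every cell whose database should store a row at this point.

    That is the home cell plus any neighbouring cell whose border lies
    within EDGE_MARGIN_KM of the point. Shifting the point by the margin
    in each direction and collecting the cells it lands in is enough as
    long as the margin is smaller than the cell size.
    """
    cells = set()
    for dx in (-EDGE_MARGIN_KM, 0.0, EDGE_MARGIN_KM):
        for dy in (-EDGE_MARGIN_KM, 0.0, EDGE_MARGIN_KM):
            cells.add(home_cell(x_km + dx, y_km + dy))
    return cells
```

With these numbers, a point in the middle of a cell maps to that cell only, a point near one edge maps to two cells, and a point near a corner maps to four.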

Is this a common pattern?

I could, of course, simply find every insert, update and delete in the
application and alter the code to explicitly update all the relevant
databases, but is there a more elegant way of simply saying: "Do this
transaction on both Database A and Database B" atomically?
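For comparison, this is roughly what the explicit application-side approach would look like if it were pulled into one helper instead of being scattered through the code. It is only a sketch: sqlite3 in-memory databases stand in for the per-cell databases so that it runs anywhere, and the `people` table and the `write_to_cells` helper are hypothetical. Note that it is not truly atomic. A crash between the individual commits would leave the cells out of step, which is the gap a two-phase commit (PostgreSQL 8.1 adds PREPARE TRANSACTION) is meant to close.

```python
import sqlite3


def write_to_cells(connections, sql, params):
    """Run one statement on every connection; commit all or roll back all.

    Best effort only: the commits at the end are issued one at a time.
    """
    try:
        for conn in connections:
            conn.execute(sql, params)
    except Exception:
        # One cell rejected the statement, so undo it everywhere.
        for conn in connections:
            conn.rollback()
        raise
    for conn in connections:
        conn.commit()


# Two in-memory databases stand in for cell A and cell B.
cell_a = sqlite3.connect(":memory:")
cell_b = sqlite3.connect(":memory:")
for conn in (cell_a, cell_b):
    conn.execute(
        "CREATE TABLE people (name TEXT PRIMARY KEY, x_km REAL, y_km REAL)"
    )
    conn.commit()

# A row near the A/B border is written to both cells.
write_to_cells(
    [cell_a, cell_b],
    "INSERT INTO people VALUES (?, ?, ?)",
    ("alice", 48.0, 25.0),
)
```

The list of connections passed in would come from whatever decides which cells a row belongs to.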

I've had a look at some replication solutions, but they all seem to involve
replicating an entire database. The advantage of my scheme is that if I can
distribute my application over large numbers of small servers, I'll end up
with more bang for the buck, and it'll be much easier to manage growth by
managing the number of servers, and number of cells hosted on each server.

Thanks for any suggestions!
Andy Ballingall
