Re: Architecting a database

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: tony(at)exquisiteimages(dot)com, pgsql-performance(at)postgresql(dot)org
Subject: Re: Architecting a database
Date: 2010-06-26 00:02:51
Message-ID: 4C2543AB.90000@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Kevin Grittner wrote:
> A schema is a logical separation within a database. Table
> client1.account is a different table from client2.account. While a
> user can be limited to tables within a single schema, a user with
> rights to all the tables can join between them as needed. You could
> put common reference data in a public schema which all users could
> access in addition to their private schemas

My guess would be that this app will end up being best split by schema.
I wonder whether it *also* needs to be split by database, too. 2000
clusters is clearly a nightmare, and putting all the client data into
one big table has both performance and security issues; that leaves
database and schema as possible splits. However, having 2000 databases
in a cluster is probably too many; having 2000 schemas in a database
might also be too many. There are downsides to expanding either of
those to such a high quantity.

In order to keep both those in the domain where they perform well and
are managable, it may be that what's needed is, say, 50 databases with
40 schemas each, rather than 2000 of either. Hard to say the ideal
ratio. However, I think that at the application design level, it would
be wise to consider each client as having a database+schema pair unique
to them, and with the assumption some shared data may need to be
replicated to all the databases in the cluster. Then it's possible to
shift the trade-off around as needed once the app is built. Building
that level of flexibility in shouldn't be too hard if it's in the design
from day one, but it would be painful bit of refactoring to do later.
Once there's a prototype, then some benchmark work running that app
could be done to figure out the correct ratio between the two. It might
even make sense to consider full scalability from day one and make the
unique client connection info host:port:database:schema.

P.S. Very refreshing to get asked about this before rather than after a
giant app that doesn't perform well is deployed.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Robert Haas 2010-06-26 01:15:36 Re: WAL+Os on a single disk
Previous Message Craig James 2010-06-25 23:30:55 Re: Architecting a database