Skip site navigation (1) Skip section navigation (2)

Re: Architecting a database

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: tony(at)exquisiteimages(dot)com, pgsql-performance(at)postgresql(dot)org
Subject: Re: Architecting a database
Date: 2010-06-26 00:02:51
Message-ID: 4C2543AB.90000@2ndquadrant.com (view raw or flat)
Thread:
Lists: pgsql-performance
Kevin Grittner wrote:
> A schema is a logical separation within a database.  Table
> client1.account is a different table from client2.account.  While a
> user can be limited to tables within a single schema, a user with
> rights to all the tables can join between them as needed.  You could
> put common reference data in a public schema which all users could
> access in addition to their private schemas

My guess would be that this app will end up being best split by schema.  
I wonder whether it *also* needs to be split by database, too.  2000 
clusters is clearly a nightmare, and putting all the client data into 
one big table has both performance and security issues; that leaves 
database and schema as possible splits.  However, having 2000 databases 
in a cluster is probably too many; having 2000 schemas in a database 
might also be too many.  There are downsides to expanding either of 
those to such a high quantity.

In order to keep both those in the domain where they perform well and 
are managable, it may be that what's needed is, say, 50 databases with 
40 schemas each, rather than 2000 of either.  Hard to say the ideal 
ratio.  However, I think that at the application design level, it would 
be wise to consider each client as having a database+schema pair unique 
to them, and with the assumption some shared data may need to be 
replicated to all the databases in the cluster.  Then it's possible to 
shift the trade-off around as needed once the app is built.  Building 
that level of flexibility in shouldn't be too hard if it's in the design 
from day one, but it would be painful bit of refactoring to do later.  
Once there's a prototype, then some benchmark work running that app 
could be done to figure out the correct ratio between the two.  It might 
even make sense to consider full scalability from day one and make the 
unique client connection info host:port:database:schema.

P.S. Very refreshing to get asked about this before rather than after a 
giant app that doesn't perform well is deployed.

-- 
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com   www.2ndQuadrant.us


In response to

Responses

pgsql-performance by date

Next:From: Robert HaasDate: 2010-06-26 01:15:36
Subject: Re: WAL+Os on a single disk
Previous:From: Craig JamesDate: 2010-06-25 23:30:55
Subject: Re: Architecting a database

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group