Quick Links

Re: One large v. many small

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Noah Silverman <noah(at)allresearch(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: One large v. many small
Date:	2003-01-30 17:56:56
Message-ID:	200301300956.56041.josh@agliodbs.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general pgsql-performance

Noah,

> As we continue our evaluation of Postgres, another interesting topic
> has come up that I want to run by the group.
>
> In our current model, we have about 3,000 small tables that we use
> track data for our clients. Each table is an identical structure, and
> holds the data for one client.

I'd list what's wrong with this structure, but frankly it would take me long
enough that I'd need a consulting fee. Suffice it to say that the above is
a very, very bad (or at least antiquated) design idea and you need to
transition out of it as soon as possible.

> Another idea that we are considering is one big table instead of 3,000
> smaller ones. We could simply add a numeric field to indicate which
> client a particular record was for.

Yes. Absolutely. Although I'd suggest an Integer field.

> Each table has between 500 and 50,000 records, so the big table could
> have up to 10 million rows if we combined everything.

Sure.

> A query on our current system is (for client #4)
>
> Select (*) from client_4 where foo=2;
>
> A query from the new, proposed system would be
>
> Select (*) from big_results where client=4 and foo=2.
>
> The big questions is, WHICH WILL BE FASTER with Postgres. Is there any
> performance improvement or cost to switching to this new structure.

Oh, no question query 1 will be faster ... FOR THAT QUERY. You are asking the
wrong question.

However, explain to me how, under the current system, you can find the client
who ordered $3000 worth of widgets on January 12th if you don't already know
who it is? I'm not sure a 3000-table UNION query is even *possible*.

Or how about giving me the average number of customer transactions in a month,
across all clients?

<rant>

You've enslaved your application design to performance considerations ... an
approach which was valid in 1990, because processing power was so limited
then. But now that dual-processor servers with RAID can be had for less than
$3000, there's simply no excuse for violating the principles of good
relational database design just to speed up a query. Buying more RAM is
much cheaper than having an engineer spend 3 weeks fixing data integrity
problems.

The proper way to go about application design is to build your application on
paper or in a modelling program according to the best principles of software
design available, and *then* to discuss performance issues -- addressing them
*first* by buying hardware, and only compromising your applcation design when
no other alternative is available.

</rant>

I strongly suggest that you purchase Pascal's "Practical Issues in Database
Design" and give it a read.

--
Josh Berkus
Aglio Database Solutions
San Francisco

In response to

One large v. many small at 2003-01-30 17:34:36 from Noah Silverman

Responses

Re: One large v. many small at 2003-01-30 19:13:38 from Jeff
Re: One large v. many small at 2003-01-31 03:54:26 from Curt Sampson

Browse pgsql-general by date

	From	Date	Subject
Next Message	Ron St.Pierre	2003-01-30 17:59:17	Re: Documentation needs significant improvement
Previous Message	will trillich	2003-01-30 17:36:40	Re: Documentation needs significant improvement

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Andrew Sullivan	2003-01-30 18:02:40	Re: One large v. many small
Previous Message	Noah Silverman	2003-01-30 17:34:36	One large v. many small