Re: DB design advice: lots of small tables?

From: Shaun Thomas <sthomas(at)optionshouse(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: lender <crlender(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: DB design advice: lots of small tables?
Date: 2013-03-15 16:25:38
Message-ID: 51434B82.6090602@optionshouse.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 03/15/2013 08:36 AM, Kevin Grittner wrote:

> I occasionally hear someone maintaining that having a meaningless
> sequential ID column as the primary key of each table is required
> by the relational model.

You know, I've heard you mention this a couple times, and I still don't
understand why you maintain such a position. Artificial the linkages may
be, but unless I'm missing something, the level of abstraction is often
necessary to avoid excessive storage requirements and bulky foreign keys.

For me, it comes down to two things:

1. Copy a "natural" arbitrary-length text string possibly millions of
times, or use a 32/64-bit integer lookup value.
2. Use multiple natural columns in a primary key, necessitating copying
all columns to child tables for foreign keys, or use one surrogate key
for both.

Yes, it complicates the schema. Possibly needlessly so. But until
someone comes up with a database storage method that automatically
deduplicates stored data, I can never advocate using arbitrary strings
as natural keys, no matter how unique and meaningful they are.

Maybe I just say that because I was burned by just such a column in a
previous engagement. It was 64-characters of arbitrary text, and was
used as a lookup value for dozens of tables. Had it been mapped to a
"meaningless" surrogate key, several tables would have been halved (or
more) in size. Don't even get me started on indexing that horrible
monstrosity in every table it lived in.

Small tables? Yeah, whatever. I don't care. But anything that has the
potential to be duplicated millions of times? You better bet that's
going to be a serial identifier.

Now, you've been doing this longer than I have, in a wider array of
contexts, so you clearly have some perspective I don't. From where I'm
sitting though, I don't get the barely suppressed rage. ;)

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
sthomas(at)optionshouse(dot)com

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Bob Brown 2013-03-15 18:01:59 link
Previous Message lender 2013-03-15 16:09:59 Re: DB design advice: lots of small tables?