Re: Advice needed

From: Lew <noone(at)lewscanon(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: Advice needed
Date: 2010-10-30 18:39:44
Message-ID: iahopb$2h9$1@news.albasani.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

On 10/29/2010 05:27 PM, ALT SHN wrote:
> Hello List,
>
> I'm new to pgsql and to databases in general, and i'm about to implement
> a small db for a non profit organization in Portugal, focused on
> paleontology, and I seek some advices.
> I'll expose my issues regarding relational model theory:
>
> 1 - Is it possible that a primary key and a given attribute are similar
> in the same line of a table? Like this:
>
> * Name (PK) | Category | Filos *
> _ a _ | g |_ a _
> _ b _ | _b_ | k

The value can be the same in different columns that draw values from the same
domain. It looks like all three of your columns draw from a character domain
like CHAR, VARCHAR or TEXT. The interpretation of a common value has no more
correlation than, for example, the interpretation of the word "driver" in a
context of computer software, automobile operation or the game of golf. Same
value, different purpose in different domains.

To make a table or view of my example, I could track the emissions control
software type, the golf club in the back seat and the person behind the wheel
of a particular automobile like

CREATE TABLE automobile_state
(
auto_id IDTYPE FOREIGN KEY REFERENCES auto (auto_id),
moment TIMESTAMP,
passenger_role VARCHAR(6) DEFAULT 'driver',
controller_type VARCHAR(6) DEFAULT 'driver',
golf_club VARCHAR(6) DEFAULT 'driver',
PRIMARY KEY (auto_id, moment)
);

> 2 - Relational Database Theory advices you to do normalization at
> least to 3rd normal form. However, if I strictily follow this, I'll end
> up with tens of tables, and all of them will have to be filled from
> scratch. But if I stick to 2nd normal form, I'll end up with only 6

It's the same information to capture regardless of the number of tables.
Whether you divide 60 attributes as 10 each for 6 tables or 3 each for 20
tables, you still have to track 60 attributes.

Sacrificing third-normal form ("3NF") costs you a lot more in potential data
anomalies and geometric growth in query set size.

I tried to make a case a few years back to normalize a database further when I
saw that one of my client's reports needed a full cross-join between a table
of size M and one of size N. A 3NF structure would have reduced the report
query to linear size growth instead of geometric. My client wouldn't hear of
making a DB change, so their complaint became that the report was too slow.

My successor made the normalization change on my advice and was hailed a hero.

> tables wich makes it much easier for users to fill them. So, my question

The complexity seen by the users should have no correlation to the complexity
of the data model. You don't give the users a normalized view of the data -
you give them appropriate denormalized views ("denorms"). This is a matter of
user interface design, which is driven by the needs of the business domain,
not by the data model.

> is: given the computational capabilities of today, and given the
> expected size of this DB (10000 records), Wouldn'd be better to let to
> normalization as of 2nd normal form?

No.

Others disagree, e.g.,
<http://www.tekstenuitleg.net/en/articles/software/database-design-tutorial/third-normal-form.html>
(which I found via the link I provide /infra/).

> BTW, if anyone as good scientif papers on the subject that could share i
> would also appreciate.

<http://lmgtfy.com/?q=database+third-normal+form+advantages>

--
Lew

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Lew 2010-10-30 20:50:49 Re: Advice needed
Previous Message Chris Browne 2010-10-29 22:23:43 Re: Advice needed