Skip site navigation (1) Skip section navigation (2)

Re: Advice needed

From: Lew <noone(at)lewscanon(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: Advice needed
Date: 2010-10-30 18:39:44
Message-ID: iahopb$2h9$1@news.albasani.net (view raw or flat)
Thread:
Lists: pgsql-novice
On 10/29/2010 05:27 PM, ALT SHN wrote:
> Hello List,
>
> I'm new to pgsql and to databases in general, and i'm about to implement
> a small db for a non profit organization in Portugal, focused on
> paleontology, and I seek some advices.
> I'll expose my issues regarding relational model theory:
>
> 1 - Is it possible that a primary key and a given attribute are similar
> in the same line of a table? Like this:
>
> * Name (PK) | Category | Filos *
> _ a _          | g          |_ a _
> _ b _          | _b_          |  k

The value can be the same in different columns that draw values from the same 
domain.  It looks like all three of your columns draw from a character domain 
like CHAR, VARCHAR or TEXT.  The interpretation of a common value has no more 
correlation than, for example, the interpretation of the word "driver" in a 
context of computer software, automobile operation or the game of golf.  Same 
value, different purpose in different domains.

To make a table or view of my example, I could track the emissions control 
software type, the golf club in the back seat and the person behind the wheel 
of a particular automobile like

  CREATE TABLE automobile_state
  (
    auto_id IDTYPE FOREIGN KEY REFERENCES auto (auto_id),
    moment  TIMESTAMP,
    passenger_role VARCHAR(6) DEFAULT 'driver',
    controller_type VARCHAR(6) DEFAULT 'driver',
    golf_club      VARCHAR(6) DEFAULT 'driver',
    PRIMARY KEY (auto_id, moment)
  );

>   2 - Relational Database Theory advices you to do normalization at
> least to 3rd normal form. However, if I strictily follow this, I'll end
> up with tens of tables, and all of them will have to be filled from
> scratch. But if I stick to 2nd normal form, I'll end up with only 6

It's the same information to capture regardless of the number of tables. 
Whether you divide 60 attributes as 10 each for 6 tables or 3 each for 20 
tables, you still have to track 60 attributes.

Sacrificing third-normal form ("3NF") costs you a lot more in potential data 
anomalies and geometric growth in query set size.

I tried to make a case a few years back to normalize a database further when I 
saw that one of my client's reports needed a full cross-join between a table 
of size M and one of size N.  A 3NF structure would have reduced the report 
query to linear size growth instead of geometric.  My client wouldn't hear of 
making a DB change, so their complaint became that the report was too slow.

My successor made the normalization change on my advice and was hailed a hero.

> tables wich makes it much easier for users to fill them. So, my question

The complexity seen by the users should have no correlation to the complexity 
of the data model.  You don't give the users a normalized view of the data - 
you give them appropriate denormalized views ("denorms").  This is a matter of 
user interface design, which is driven by the needs of the business domain, 
not by the data model.

> is: given the computational capabilities of today, and given the
> expected size of this DB (10000 records), Wouldn'd be better to let to
> normalization as of 2nd normal form?

No.

Others disagree, e.g., 
<http://www.tekstenuitleg.net/en/articles/software/database-design-tutorial/third-normal-form.html> 
(which I found via the link I provide /infra/).

> BTW, if anyone as good scientif papers on the subject that could share i
> would also appreciate.

<http://lmgtfy.com/?q=database+third-normal+form+advantages>

-- 
Lew

In response to

Responses

pgsql-novice by date

Next:From: LewDate: 2010-10-30 20:50:49
Subject: Re: Advice needed
Previous:From: Chris BrowneDate: 2010-10-29 22:23:43
Subject: Re: Advice needed

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group