Re: Primary keys in a single column table and text vs varchar

From: Lew <noone(at)lewscanon(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: Primary keys in a single column table and text vs varchar
Date: 2011-02-01 18:59:44
Message-ID: ii9l5r$9ke$1@news.albasani.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

matty jones wrote:
>> I am designing a table that will hold a list of unique names and be
>> referenced by several other tables. There will only be one column in the
>> table (name), should I still create a separate primary key for that column
>> or can I use the name column as my key. I have also been reading a lot
>> about the differences between varchar and text and was wondering if there is
>> a downside to using text. The column I want to use text on (notes) will
>> hold a string that could be any length, I can use varchar(255) but I am
>> worried about the text being truncated or of the queries giving errors when
>> I try to output the text. I understand there is a mathematical limit to how
>> much can be stored in a single entry but I wasn't sure of any downsides to
>> just defaulting to text when possible.

Michael Swierczek wrote:
> It may help you to read pages on natural key versus surrogate key
> discussions, like this page:
> http://www.agiledata.org/essays/keys.html
> To summarize briefly, using the name column as the primary key makes
> your database easier to understand, but if you need to change the
> table in the future and add or remove additional unique constraints,
> changing your database is a lot of work. Using a separate column for
> the primary key gives you the converse situation, so your database
> layout is less intuitive, but changing the tables in the future
> becomes much easier. PostgreSQL will handle either type of primary
> key just fine. In my limited experience a separate key usually is a
> better choice, but I've only been wrestling with databases for a few
> years.
>
> In terms of text versus character varying, I don't think the
> differences are significant. I would use character varying(500) or
> (1000), so you have plenty of room for big names but you can't
> accidentally have a 10MB entry in the table.

For simple one-column lookup tables such as the OP describes, the surrogate
key adds little benefit but relatively larger cost.

The lookup value is supposed to be constrained by the lookup table, and using
a surrogate key to protect against changes in that constraint defeats the
purpose. If you do change it, it means the value acquired for all earlier
transactions was wrong, or at least that you lose the history of what value
was used at that time. You risk semantic changes, e.g., if you change

1 FOO
2 BAR
3 BAZ
4 QUX

to

1 BAR
2 URP
3 BAX
4 QUX
.

You might think you'll protect against that, but the next database copy or
move to a new host via dump/restore could really hose things.

The referencing table also becomes harder to use - EVERY query will need a
join to get at the real value. This will slow down query performance, and a
lookup tables' use is likely to be mostly for queries. Using the natural key
directly means that the FK reference will guarantee a correct value, but you
won't need a join to query that value.

The complexity of managing a second column for a one-column table is likely to
be too high for the limited benefit conferred on a lookup table. Just store
the one natural-key column for a lookup table.

--
Lew
Ceci n'est pas une fenêtre.
.___________.
|###] | [###|
|##/ | *\##|
|#/ * | \#|
|#----|----#|
|| | * ||
|o * | o|
|_____|_____|
|===========|

In response to

Browse pgsql-novice by date

  From Date Subject
Next Message Andrej 2011-02-01 20:20:29 Re: ERROR: invalid datatype 'FILE'
Previous Message Thomas Kellerer 2011-02-01 18:49:00 Re: Schema versioning in postgres