Re: Geographic data sources, queries and questions

From: Michael Glaesemann <grzm(at)seespotcode(dot)net>
To: Oliver Elphick <olly(at)lfix(dot)co(dot)uk>
Cc: btober(at)ct(dot)metrocast(dot)net, pgsql-general(at)postgresql(dot)org
Subject: Re: Geographic data sources, queries and questions
Date: 2007-05-29 22:36:14
Message-ID: F1DA8F87-4B1E-472A-B26D-244F09A4AC5B@seespotcode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


On May 29, 2007, at 14:50 , Oliver Elphick wrote:

> On Tue, 2007-05-29 at 13:49 -0500, Michael Glaesemann wrote:
>>
>> If you're handling more than one country, you'll most likely want to
>> associate the states with their respective countries.
>>
>> -- Listing 4
>> CREATE TABLE countries
>> (
>> country_id INTEGER PRIMARY KEY
>> );
>>
>> CREATE TABLE states
>> (
>> state_id INTEGER PRIMARY KEY
>> , state_name TEXT NOT NULL
>> , country_id INTEGER NOT NULL
>> REFERENCES countries (country_id)
>> );
>>
>> Note that there's no UNIQUE constraint on state_name. You may have
>> more than one state with the same state_name around the world so you
>> may want to make sure that for each country, each state_name is
>> unique:
>
> You have assumed that state codes are unique integers, but for a
> worldwide database that is probably a bad design.

Actually, my intent was to use state_id as a surrogate key for
state_name. I assumed unique state_names per country. If one wanted
state codes, (such as ISO 3166-2), you'd add columns for that.

> Furthermore, these codes are
> not going to be unique. For instance MH is the US abbreviation for
> the
> Marshall Islands [US Post Office] and also the Indian abbreviation for
> Maharashtra [Wikipedia]. In such a case I would always make the
> country
> code part of the primary key and not just an attribute. Again this
> saves your having to invent a new set of codes when one exists
> already.

This trends towards the discussion on whether or not to use surrogate
keys. In the schema I suggested, the natural key is (state_name,
country_id). If you wanted to use ISO 3166-1 codes in both countries
and states tables rather than the country_id surrogate key, that'd
work, too. Surrogate keys are tangential to the normalization issues
I was addressing.

Michael Glaesemann
grzm seespotcode net

In response to

Browse pgsql-general by date

  From Date Subject
Next Message brsaweda 2007-05-29 22:42:53 SELECT <all fields except "bad_field"> from mytbl;
Previous Message Diego Fernando Arce 2007-05-29 22:30:19 Re: monitor stats