Re: Best practices for geo-spatial city name searches?

From: Steve Midgley <science(at)misuse(dot)org>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: Best practices for geo-spatial city name searches?
Date: 2009-02-25 18:42:49
Message-ID: 20090225184316.09F9E633204@mail.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

At 08:20 AM 2/25/2009, pgsql-sql-owner(at)postgresql(dot)org wrote:
>To: pgsql-sql(at)postgresql(dot)org
>From: Mark Stosberg <mark(at)summersault(dot)com>
>Subject: Best practices for geo-spatial city name searches?
>Date: Tue, 24 Feb 2009 11:19:56 -0500
>Message-ID: <20090224111956(dot)5b7a4301(at)summersault(dot)com>
>X-Archive-Number: 200902/94
>X-Sequence-Number: 32231
>
>Hello,
>
>I use PostgreSQL and the "cube" type to perform geo-spatial zipcode
>proximity
>searches. I'm wondering about the best practices also supporting a
>geo-spatial
>distance search based on a city name rather than zipcode.
>
>In our original data model, we used a 'zipcodes' table, with the
>zipcode as the
>primary key. This can of course contain a "City Name" column, but
>there is a
>problem with this, illustrated a "Nome, Alaska" case. Nome's zipcode
>is 99762.
>It maps to multiple cities including Diomede, Alaska and Nome, Alaska.
>
>In the data model described, only the "Diomede" row is imported, and
>the other
>rows, including the "Nome, Alaska" row are dropped. So if you try to
>search
>for Nome, Alaska, you won't find anything.
>
>One solution would be to have a "cities" table, with the city/state as
>the
>primary key, and a zipcode as an additional column. Then, by joining
>on the
>zipcodes table, the coordinates for a city could be found.
>
>Is there any other way I should be considering data modelling to
>support
>searches on zipcodes and cities?

Hi Mark,

I built a very similar system for www.hutz.com. It uses a complete
postcode database, without dropping nearby/overlapping cities. It also
includes the postcode "alias" values, which are names that the post
office uses as "equivalent" to the official names.

Within the city table, I created a series of self-joining id's:

id|alias_city_id|post_code_city_id|muni_city_id

So a city record can be an "alias", "postcode" or "muni" record. A muni
record is the definitive record for a city (and is defined by the
postcode record closest to the city center as defined by the USGS). A
postcode record, represents a postcode (zipcode) region within a city.
An alias represents an alternate name that either refers to a muni
record or a postcode record (and is defined as "alias_city_id IS NOT
NULL")

So if I want to search the table for only muni city records, the query
looks like

select * from city where id = muni_city_id

I also included lat/long coordinates for every record, making it easy
to calculate distances and find all city records within a certain
range, etc. (I used the "point" and "circle" operators for this along
with a GiST index - it's not perfect for long distances - it assumes
the earth is flat, but it works great for small distances and is very
fast).

I hope this helps. Feel free to contact me on-list or off, if you want
to discuss more.

Steve

Browse pgsql-sql by date

  From Date Subject
Next Message Ivan Sergio Borgonovo 2009-02-27 11:56:06 row not deleted but updated (trigger?)
Previous Message A. Kretschmer 2009-02-25 06:33:59 Re: Add column by using SELECT statement