Re: extracting location info from string

From: Andrej <andrej(dot)groups(at)gmail(dot)com>
To: Tarlika Elisabeth Schmitz <postgresql3(at)numerixtechnology(dot)de>
Cc: sql pgsql <pgsql-sql(at)postgresql(dot)org>
Subject: Re: extracting location info from string
Date: 2011-05-23 01:11:24
Message-ID: BANLkTikGzsrvXprLXeN3TxZwA=M8fzkn=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On 23 May 2011 10:00, Tarlika Elisabeth Schmitz
<postgresql3(at)numerixtechnology(dot)de> wrote:
> On Sun, 22 May 2011 21:05:26 +0100
> Tarlika Elisabeth Schmitz <postgresql3(at)numerixtechnology(dot)de> wrote:
>
>>A column contains location information, which may contain any of the
>>following:
>>
>>1) null
>>2) country name (e.g. "France")
>>3) city name, region name (e.g. "Bonn, Nordrhein-Westfalen")
>>4) city name, Rg. region name (e.g. "Frankfurt, Rg. Hessen")
>>5) city name, Rg region name (e.g. "Frankfurt, Rg Hessen")
>
>
> I also need to cope with variations of COUNTRY.NAME and REGION.NAME.

I'm not sure I fully understand your request, but sanitising that data will be
tedious, whichever way you dice it - particularly with the variations
on region.

Another thing of great import is whether the city can occur in the
data column all by itself; if yes, it's next to impossible to distinguish
it from a country. Specially if we assume that typos/misspelling are
feasible on top of punctuation ... of course, you could create a list
of valid cities and countries, with homophones thrown in for good
measure, and compare & replace things appropriately, w/ name
w/o a clean match being reported for human interaction =o)

If I had a task like that to perform I'd dump the data out to file
and have a good go at it w/ sed & awk, or perl, depending on
how complex & voluminous the data is.

Cheers,
Andrej

--
Please don't top post, and don't use HTML e-Mail :}  Make your quotes concise.

http://www.georgedillon.com/web/html_email_is_evil.shtml

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Craig Ringer 2011-05-23 01:42:31 Re: extracting location info from string
Previous Message Tarlika Elisabeth Schmitz 2011-05-22 22:00:41 Re: extracting location info from string