Skip site navigation (1) Skip section navigation (2)

Re: BUG #1775: UTF8 to ISO does not convert the german

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: marcus(at)raphelt(dot)de
Cc: peter_e(at)gmx(dot)net, mha(at)sollentuna(dot)net, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #1775: UTF8 to ISO does not convert the german
Date: 2005-07-20 08:27:26
Message-ID: 20050720.172726.68162903.t-ishii@sra.co.jp (view raw or flat)
Thread:
Lists: pgsql-bugs
> Hi Tatsuo,
> 
> Am Mittwoch, 20. Juli 2005 um 01:00 schrieben Sie:
> 
> TI> conversion tables. So if german umlauts are converted fine, there's no
> TI> reason the conversion for german sharp s does not work.
> 
> TI> Marcus,
> 
> TI> Can you give me the exact error message from PostgreSQL when the
> TI> conversio failed?
> 
> Well - actually, there IS no error message, convert() just returns an
> empty sting. In detail, here's what I did:
> 
> -Set up PostgreSQL (Ascii)
> 
> -Imported the opengeodb
> http://sourceforge.net/project/showfiles.php?group_id=132421
> As they offer a PostgreSQL dump, I chose this one.
> 
> -Imported the dump using pgAdminIII
> 
> -Created a view that returns all german cities with ZIP and
> opengeodb-locationID:
> 
> CREATE OR REPLACE VIEW orte_de AS
> SELECT code.text_val AS plz, code.loc_id, town.text_val AS ort
> FROM geodb_hierarchies hi, geodb_textdata state, geodb_textdata town, geodb_textdata code
> WHERE hi.id_lvl2 = state.loc_id AND state.text_val = 'DE'::text
> AND state.text_type = 500100001 AND town.loc_id = hi.loc_id
> AND town.text_type = 500100000 AND code.loc_id = town.loc_id
> AND code.text_type = 500300000;
> 
> So now I've got three columns: "plz" (zip), "ort" (city) and loc_id.
> Assuming I want to retrieve cites in the Hamburg / Hannover area...
> (perfect for this task, as they have pretty strange city names there
> :)
> 
> SELECT plz, loc_id, ort, convert(ort using utf_8_to_iso_8859_1) as
> ort_conv
> from orte_de
> where plz between 20000 and 30000
> order by ort_conv
> 
> This query returns empty values for "ort_conv" if "ort" contains a
> sharp s.
> Btw, it seems like it is the same for "" (&Auml;), have a look at
> loc_id 25182.

I see no problem with encoding conversion itself:

utf8=# \encoding latin1
utf8=# select * from t1;
 t  
----
 籖
(1 row)

> SELECT plz, loc_id, ort, convert(ort using utf_8_to_iso_8859_1) as
> ort_conv
> from orte_de
> where plz between 20000 and 30000
> order by ort_conv

Problem here is the result of convert(ort using utf_8_to_iso_8859_1)
is ISO-8859-1 but your database encoding is UTF-8, so the terminal
treats ISO-8859-1 chars as UTF-8 which will result in unexpected
characters displayed.

I guess what you want to do is:

SELECT plz, loc_id, ort from orte_de
where plz between 20000 and 30000
order by convert(ort using utf_8_to_iso_8859_1)
--
Tatsuo Ishii

In response to

Responses

pgsql-bugs by date

Next:From: Marcus RapheltDate: 2005-07-20 09:50:49
Subject: Re: BUG #1775: UTF8 to ISO does not convert the german "sharp s" (&szlig;)
Previous:From: Marcus RapheltDate: 2005-07-20 07:17:36
Subject: Re: BUG #1775: UTF8 to ISO does not convert the german "sharp s" (&szlig;)

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group