Re: BUG #1775: UTF8 to ISO does not convert the german

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: marcus(at)raphelt(dot)de
Cc: peter_e(at)gmx(dot)net, mha(at)sollentuna(dot)net, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #1775: UTF8 to ISO does not convert the german
Date: 2005-07-20 08:27:26
Message-ID: 20050720.172726.68162903.t-ishii@sra.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

> Hi Tatsuo,
>
> Am Mittwoch, 20. Juli 2005 um 01:00 schrieben Sie:
>
> TI> conversion tables. So if german umlauts are converted fine, there's no
> TI> reason the conversion for german sharp s does not work.
>
> TI> Marcus,
>
> TI> Can you give me the exact error message from PostgreSQL when the
> TI> conversio failed?
>
> Well - actually, there IS no error message, convert() just returns an
> empty sting. In detail, here's what I did:
>
> -Set up PostgreSQL (Ascii)
>
> -Imported the opengeodb
> http://sourceforge.net/project/showfiles.php?group_id=132421
> As they offer a PostgreSQL dump, I chose this one.
>
> -Imported the dump using pgAdminIII
>
> -Created a view that returns all german cities with ZIP and
> opengeodb-locationID:
>
> CREATE OR REPLACE VIEW orte_de AS
> SELECT code.text_val AS plz, code.loc_id, town.text_val AS ort
> FROM geodb_hierarchies hi, geodb_textdata state, geodb_textdata town, geodb_textdata code
> WHERE hi.id_lvl2 = state.loc_id AND state.text_val = 'DE'::text
> AND state.text_type = 500100001 AND town.loc_id = hi.loc_id
> AND town.text_type = 500100000 AND code.loc_id = town.loc_id
> AND code.text_type = 500300000;
>
> So now I've got three columns: "plz" (zip), "ort" (city) and loc_id.
> Assuming I want to retrieve cites in the Hamburg / Hannover area...
> (perfect for this task, as they have pretty strange city names there
> :)
>
> SELECT plz, loc_id, ort, convert(ort using utf_8_to_iso_8859_1) as
> ort_conv
> from orte_de
> where plz between 20000 and 30000
> order by ort_conv
>
> This query returns empty values for "ort_conv" if "ort" contains a
> sharp s.
> Btw, it seems like it is the same for "" (&Auml;), have a look at
> loc_id 25182.

I see no problem with encoding conversion itself:

utf8=# \encoding latin1
utf8=# select * from t1;
t
----

(1 row)

> SELECT plz, loc_id, ort, convert(ort using utf_8_to_iso_8859_1) as
> ort_conv
> from orte_de
> where plz between 20000 and 30000
> order by ort_conv

Problem here is the result of convert(ort using utf_8_to_iso_8859_1)
is ISO-8859-1 but your database encoding is UTF-8, so the terminal
treats ISO-8859-1 chars as UTF-8 which will result in unexpected
characters displayed.

I guess what you want to do is:

SELECT plz, loc_id, ort from orte_de
where plz between 20000 and 30000
order by convert(ort using utf_8_to_iso_8859_1)
--
Tatsuo Ishii

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Marcus Raphelt 2005-07-20 09:50:49 Re: BUG #1775: UTF8 to ISO does not convert the german "sharp s" (&szlig; )
Previous Message Marcus Raphelt 2005-07-20 07:17:36 Re: BUG #1775: UTF8 to ISO does not convert the german "sharp s" (&szlig; )