8.3 can't convert cyrillic text from 'iso-8859-5' to other cyrillic 8-bit encoding

From: Sergey Burladyan <eshkinkot(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: 8.3 can't convert cyrillic text from 'iso-8859-5' to other cyrillic 8-bit encoding
Date: 2008-03-17 13:16:33
Message-ID: 200803171616.34193.eshkinkot@gmail.com
Lists: pgsql-bugs

Hi all,

In 8.3 I can't use convert(bytea, name, name), which returns bytea, to convert
from 'iso-8859-5' to 'windows-1251' or any other Cyrillic 8-bit encoding.

seb=> show client_encoding ;
client_encoding
-----------------
UTF8

seb=> show server_encoding;
server_encoding
-----------------
UTF8

seb=> select version();
version
----------------------------------------------------------------------------------------
PostgreSQL 8.3.0 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.2.3 (Debian 4.2.3-1)

lc_collate | ru_RU.UTF-8
lc_ctype | ru_RU.UTF-8
lc_messages | ru_RU.UTF-8
lc_monetary | ru_RU.UTF-8
lc_numeric | ru_RU.UTF-8
lc_time | ru_RU.UTF-8

seb=> select
convert(convert('абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ', 'utf-8', 'iso-8859-5'), 'iso-8859-5', 'windows-1251');
ERROR: character 0xf1 of encoding "ISO_8859_5" has no equivalent in "MULE_INTERNAL"

The inner convert() first converts the string from my console encoding
(ru_RU.UTF-8) to iso-8859-5 (a Cyrillic 8-bit character encoding); the outer
convert() is the one that shows the problem.

windows-1251 is another Cyrillic 8-bit character encoding. Converting to koi8-r
does not work either.
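
As a cross-check outside the database, here is a minimal sketch in Python (not part of the original report, and not PostgreSQL code) that performs the same conversions through Unicode. The codec names iso8859_5, cp1251 and koi8_r are Python's own; the helper name recode is hypothetical. It suggests that every letter of the test string, including the byte 0xf1 named in the error message (the letter ё in ISO-8859-5), has an equivalent in both target encodings.

```python
# Hypothetical cross-check, not PostgreSQL code: re-encode 8-bit Cyrillic text via Unicode.
ALPHABET = "абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"


def recode(data: bytes, src: str, dst: str) -> bytes:
    """Decode data from src and encode it to dst.

    Raises UnicodeError if a character has no equivalent in dst.
    """
    return data.decode(src).encode(dst)


# Same first step as the inner convert(): UTF-8 text to ISO-8859-5 bytes.
iso = ALPHABET.encode("iso8859_5")

# Same second step as the outer convert(): ISO-8859-5 to another 8-bit encoding.
cp1251 = recode(iso, "iso8859_5", "cp1251")
koi8r = recode(iso, "iso8859_5", "koi8_r")

# The byte from the error message is the lower-case letter "ё".
print(hex(iso[6]), iso[6:7].decode("iso8859_5"))
```

If either call to recode raised an exception, the test string itself would be unconvertible; in this sketch both calls succeed.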

I wrote the output of convert(..., 'utf-8', 'iso-8859-5') to a file and read it
with iconv -f iso-8859-5: all characters were read correctly (see the attached
program).

convert(..., 'iso-8859-5', 'utf-8') looks correct. I checked it like this:
seb=> set standard_conforming_strings TO on; --- do not escape bytea
SET
seb=> select
convert('\320\321\322\323\324\325\361\326\327\330\331\332\333\334\335\336\337\340\341\342\343\344\345\346\347\350\351\352\353\354\355\356\357\260\261\262\263\264\265\241\266\267\270\271\272\273\274\275\276\277\300\301\302\303\304\305\306\307\310\311\312\313\314\315\316\317', 'iso-8859-5', 'utf-8');

convert
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

\320\260\320\261\320\262\320\263\320\264\320\265\321\221\320\266\320\267\320\270\320\271\320\272\320\273\320\274\320\275\320\276\320\277\321\200\321\201\321\202\321\203\321\204\321\205\321\206\321\207\321\210\321\211\321\212\321\213\321\214\321\215\321\216\321\217\320\220\320\221\320\222\320\223\320\224\320\225\320\201\320\226\320\227\320\230\320\231\320\232\320\233\320\234\320\235\320\236\320\237\320\240\320\241\320\242\320\243\320\244\320\245\320\246\320\247\320\250\320\251\320\252\320\253\320\254\320\255\320\256\320\257
(1 запись)

seb=> set standard_conforming_strings TO off; --- now we must escape bytea to show the text
SET
seb=> select
E'\320\260\320\261\320\262\320\263\320\264\320\265\321\221\320\266\320\267\320\270\320\271\320\272\320\273\320\274\320\275\320\276\320\277\321\200\321\201\321\202\321\203\321\204\321\205\321\206\321\207\321\210\321\211\321\212\321\213\321\214\321\215\321\216\321\217\320\220\320\221\320\222\320\223\320\224\320\225\320\201\320\226\320\227\320\230\320\231\320\232\320\233\320\234\320\235\320\236\320\237\320\240\320\241\320\242\320\243\320\244\320\245\320\246\320\247\320\250\320\251\320\252\320\253\320\254\320\255\320\256\320\257';
?column?
--------------------------------------------------------------------
абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ
(1 запись)

It is OK.
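
The same round trip can be sketched in Python, again as an illustration rather than anything from the original report. The byte ranges below are my reading of the octal escapes in the query above (\320 to \357 for the lower-case letters with \361 for ё, \260 to \317 for the upper-case letters with \241 for Ё); the variable names are hypothetical.

```python
# Rebuild the ISO-8859-5 input bytes from the octal escapes used in the query.
lower = bytes(range(0xD0, 0xD6)) + b"\xf1" + bytes(range(0xD6, 0xF0))  # а..е, ё, ж..я
upper = bytes(range(0xB0, 0xB6)) + b"\xa1" + bytes(range(0xB6, 0xD0))  # А..Е, Ё, Ж..Я
iso_bytes = lower + upper

# Equivalent of convert(..., 'iso-8859-5', 'utf-8'): decode, then encode as UTF-8.
utf8_bytes = iso_bytes.decode("iso8859_5").encode("utf-8")

# Equivalent of the E'...' check: interpret the UTF-8 bytes as text.
text = utf8_bytes.decode("utf-8")
print(text)
```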

The text string parameter is the Russian alphabet from the first letter to the
last in lower case, followed by the same letters in upper case.

Maybe I am doing something wrong?

---

Attachment Content-Type Size
a.cpp text/x-c++src 1.6 KB
