Re: BUG #4098: Encoding problems

From:	Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To:	Jan-Peter Seifert <Jan-Peter(dot)Seifert(at)gmx(dot)de>
Cc:	pgsql-bugs(at)postgresql(dot)org
Subject:	Re: BUG #4098: Encoding problems
Date:	2008-04-07 17:46:29
Message-ID:	47FA5DF5.6040009@enterprisedb.com
Lists:	pgsql-bugs

Jan-Peter Seifert wrote:
> The following bug has been logged online:
>
> Bug reference: 4098
> Logged by: Jan-Peter Seifert
> Email address: Jan-Peter(dot)Seifert(at)gmx(dot)de
> PostgreSQL version: 8.2
> Operating system: Windows xp
> Description: Encoding problems
> Details:
>
> The encoding of the source db/server is LATIN1. The data type of the field
> is text (The storage mode is extended). It's/was possible to add characters
> available in CP1252 but not in LATIN1 like the Euro character (code 80).
> When exporting to UTF8 via "pg_dump -o -U postgres -E UTF-8 ..." (iconv?) it
> just adds the character with the code "C2" before the Euro character in the
> dump.

Yes, but if you do that, PostgreSQL doesn't know that code 0x80 is meant
to be the Euro character. In LATIN1, 0x80 is a control character, so the
conversion to UTF-8 produces the bytes C2 80 instead of the Euro sign you
expected.
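
The effect can be reproduced outside the database. The following is a minimal sketch using Python's built-in codecs rather than PostgreSQL's own conversion routines, on the assumption that both treat byte 0x80 the same way under LATIN1 and Windows-1252. The variable names are illustrative only.

```python
# The single byte the reporter stored in the LATIN1 database.
raw = b"\x80"

# Read as LATIN1 (ISO 8859-1), 0x80 is the control character U+0080.
as_latin1 = raw.decode("latin-1")

# Read as Windows-1252, the same byte is the Euro sign U+20AC.
as_cp1252 = raw.decode("cp1252")

# Re-encode both interpretations as UTF-8 and compare the bytes.
latin1_utf8 = as_latin1.encode("utf-8")
cp1252_utf8 = as_cp1252.encode("utf-8")

print(latin1_utf8.hex(), cp1252_utf8.hex())  # prints "c280 e282ac"
```

The first result matches the "C2" byte the reporter saw in front of 0x80 in the UTF-8 dump. The second is what a correctly labeled Euro sign looks like in UTF-8.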

You should have created the database with WIN1252 encoding to begin
with.

I think you can fix that by dumping the database in LATIN1 encoding,
changing the "client_encoding" line in the dump file to 'WIN1252', and
importing it back.
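
The edit to the dump file can be scripted. The sketch below assumes a plain-text dump whose header contains a line of the form `SET client_encoding = 'LATIN1';`. The function name `retag_client_encoding` is hypothetical and not part of any PostgreSQL tool. It works on bytes so that the data, including the 0x80 byte, passes through unchanged and only the encoding label is rewritten.

```python
import re

# Matches the first "SET client_encoding = '...'" line (assumed dump format).
_CLIENT_ENCODING = re.compile(
    rb"^(\s*SET\s+client_encoding\s*=\s*')[^']*(')", re.MULTILINE
)


def retag_client_encoding(dump: bytes, new_encoding: str = "WIN1252") -> bytes:
    """Return the dump with its client_encoding label replaced.

    Only the label changes; every data byte is left exactly as dumped.
    """
    replacement = rb"\g<1>" + new_encoding.encode("ascii") + rb"\g<2>"
    return _CLIENT_ENCODING.sub(replacement, dump, count=1)


# Tiny stand-in for a real dump file, with a raw 0x80 byte in the data.
sample = b"SET client_encoding = 'LATIN1';\nINSERT INTO t VALUES ('\x80');\n"
fixed = retag_client_encoding(sample)
print(fixed.splitlines()[0].decode("ascii"))
```

For a real file, read and write it in binary mode (`open(path, "rb")` and `open(path, "wb")`) so that no decoding is applied along the way. The target database presumably needs an encoding that can represent the Euro sign, such as WIN1252 or UTF8, for the import to succeed.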

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

BUG #4098: Encoding problems at 2008-04-07 16:16:12 from Jan-Peter Seifert
