Re: problem converting database to UTF-8

From: Schwaighofer Clemens <clemens(dot)schwaighofer(at)tequila(dot)jp>
To: David Goodenough <david(dot)goodenough(at)btconnect(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: problem converting database to UTF-8
Date: 2009-02-04 05:51:45
Message-ID: fed954960902032151h22d15774v3054245a6dd2997b@mail.gmail.com
Lists: pgsql-general

On Fri, Jan 23, 2009 at 02:18, David Goodenough
<david(dot)goodenough(at)btconnect(dot)com> wrote:
>
> Is there a definitive HOWTO that I can follow? If not, does someone
> have a set of instructions that will work?
>
> If it matters I am running under Debian.

I did this once for a very large db (large for me was 5GB), converting
it from EUC to UTF8.

1) dumped all the data (pg_dumpall) on the source system, so the dump
was all EUC
2) split the dump file into manageable chunks (per LINE, not per BYTE,
so that multibyte characters are not cut in half)
3) ran iconv -f EUC -t UTF8 -c on each chunk (yes, -c, so that iconv
drops any strange byte sequences instead of stopping at them)
4) put the converted chunks back together in the original order
5) used sed to replace all EUC encoding names in the dump with UTF8
6) imported the result into a newly created db on the target system,
with everything set to UTF8
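
For anyone who would rather script steps 2 to 5 in one pass, here is a
rough Python sketch of the same idea. It is not what I ran, only an
illustration. It assumes the source encoding is EUC-JP (written as plain
"EUC" above) and that the encoding name shows up in the dump only on
SET client_encoding and CREATE DATABASE lines. The names convert_line and
convert_dump are made up for this example.

```python
import re

# Assumption: the "EUC" in the steps above means EUC-JP.
SOURCE_ENCODING = "euc_jp"

# Assumption: only these statements in the dump declare an encoding.
DECLARATION = re.compile(r"^(SET client_encoding|CREATE DATABASE)\b")
# A quoted PostgreSQL encoding name such as 'EUC_JP'.
ENCODING_NAME = re.compile(r"'EUC_[A-Z0-9_]+'")


def convert_line(raw: bytes) -> str:
    """Decode one dump line, dropping undecodable bytes (like iconv -c)."""
    text = raw.decode(SOURCE_ENCODING, errors="ignore")
    # Rewrite the encoding name on declaration lines only, so that table
    # data which merely mentions EUC_JP is left untouched.
    if DECLARATION.match(text):
        text = ENCODING_NAME.sub("'UTF8'", text)
    return text


def convert_dump(src_path: str, dst_path: str) -> None:
    """Stream the dump line by line and write it back out as UTF-8."""
    # Reading in binary mode splits on newline bytes only. A newline byte
    # never occurs inside an EUC-JP multibyte character, so no character
    # is cut in half (the per-LINE rule from step 2).
    with open(src_path, "rb") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for raw in src:
            dst.write(convert_line(raw))
```

Calling convert_dump("dump_euc.sql", "dump_utf8.sql") (both file names
are placeholders) would then replace the split, iconv, join and sed
steps. Restricting the rename to declaration lines is a deliberate
difference from a blanket sed over the whole file.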

--
[ Clemens Schwaighofer -----=====:::::~ ]
[ IT Engineer/Manager ]
[ E-Graphics Communications, TEQUILA\ Japan IT Group ]
[ 6-17-2 Ginza Chuo-ku, Tokyo 104-8167, JAPAN ]
[ Tel: +81-(0)3-3545-7703 Fax: +81-(0)3-3545-7343 ]
[ http://www.tequila.jp ]


