Re: Second byte of multibyte characters causing trouble

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: k-ellrick(at)sctech(dot)co(dot)jp
Cc: dave(at)skiddlydee(dot)com, pgsql-general(at)postgresql(dot)org
Subject: Re: Second byte of multibyte characters causing trouble
Date: 2001-09-22 10:48:00
Message-ID: 20010922194800Y.t-ishii@sra.co.jp
Lists: pgsql-general

> Now first I have to convert my existing data, which although sitting in a
> database that expects EUC, is actually SJIS-based text. I found the
> following series of bash commands in a Japanese mailing list archive - does
> it look like this will work for me? (It looks scary to just drop the whole
> database and hope that the .out file knows how to rebuild it with all the
> indexes, sequences, users, etc. in place - should I be nervous?)
> $ pg_dump -D dbname > db.out
> $ dropdb dbname
> $ createdb -E EUC_JP dbname
> $ export PGCLIENTENCODING=SJIS
> $ psql dbname < db.out
> $ export PGCLIENTENCODING=EUC_JP

Yes, the above procedure should convert your database, which holds
SJIS-based text by mistake, into a proper EUC_JP database.
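
As a rough illustration of what the reload does to each string, and of why
SJIS bytes sitting in an EUC_JP database misbehave, here is a small Python
sketch. It is not part of the procedure above and does not touch PostgreSQL;
it only uses Python's standard shift_jis and euc_jp codecs to mimic the
conversion that takes place when PGCLIENTENCODING=SJIS is set. The sample
string and the helper name sjis_to_euc are assumptions made for the example.

```python
# Illustrative only: mimics an SJIS -> EUC_JP conversion with Python's
# standard codecs. The sample text and function name are hypothetical.

def sjis_to_euc(raw: bytes) -> bytes:
    """Decode Shift-JIS bytes and re-encode the text as EUC-JP."""
    return raw.decode("shift_jis").encode("euc_jp")

sample = "表示"  # "display"; its first character is a well-known trouble case
sjis = sample.encode("shift_jis")
euc = sjis_to_euc(sjis)

# In Shift-JIS the second byte of a character may fall in the ASCII range
# (here 0x5C, the backslash), which confuses byte-oriented processing.
ascii_bytes_in_sjis = [hex(b) for b in sjis if b < 0x80]

# In EUC-JP every byte of a JIS X 0208 character has the high bit set,
# so no byte of a multibyte character can be mistaken for ASCII.
ascii_bytes_in_euc = [hex(b) for b in euc if b < 0x80]

print(sjis.hex(), ascii_bytes_in_sjis)
print(euc.hex(), ascii_bytes_in_euc)
```

Comparing the two printed lines shows the ASCII-range byte disappearing
after conversion, which is the practical reason for storing the data as
EUC_JP on the server side.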

> Regarding the user interface end, when I read the suggested solution of
> using jcode to convert everything in and out of the database, I thought,
> "That's tedious! Why not just use EUC on the web pages, and the whole
> system will be in sync?" But that seems to be almost as tedious. The
> Windows-based editor I normally use to input the Japanese text portions of
> my code (I do most of the work in vi on my Linux box, but I can't input the
> Japanese that way)

You can't input Japanese using vi? Why?

> reads and writes in Shift-JIS unless I use pre- and
> post-processing filters, and it seems that other Windows programs also favor
> Shift-JIS.

Why not emacs? It can read and write SJIS text directly.

> I did a totally unofficial, very-small-data-sample survey of
> Japanese web sites, and it seems that in general, sites that deal with
> ordinary consumers (and likely are written on Microsoft machines) use
> Shift-JIS (even ones that I figure must use databases, like search engines
> and e-commerce), Linux-related sites use JIS, and PostgreSQL-related sites
> use EUC. I'm sure there's a grand story to explain how it got to be this
> messy, but for right now, I guess we have to live with all these different
> systems - apparently there is not one system that works nicely for all
> things, or else the others would gradually become obsolete, right?
>
> Before I add jcode function calls for every piece of data I get in or out of
> the database, or convert all my web page text to EUC-JP (I haven't decided
> yet which approach is more work, or more of a problem to maintain), are
> there any other thoughts on this? For example, does someone know of one of
> the following: (a) a way to get the text-only console of a RedHat 6.1J box
> to actually display Japanese characters (if so, I not only wouldn't have to
> deal with the Windows box for editing, I could even read the output of
> queries in psql!),

Use the "kon" command.

> or (b) a text editor for Windows that can be configured
> to default to EUC, rather than having to remember to always select a filter
> to convert to and from Shift-JIS?

Again, why not emacs?

> Or on the flip side of the discussion,
> can anyone imagine pitfalls associated with having a web site that is half
> EUC (the PHP and Perl files that deal with the database) and half Shift-JIS
> (the static HTML pages that are written by other people in who-knows-what
> Windows-based tools)?

Are you using PHP? Then I strongly recommend upgrading to PHP 4.0.6 or
higher. It supports Japanese very well: it automatically guesses the input
charset and does the necessary conversion. This is very helpful. Also,
I recommend that you always use EUC-JP to write PHP scripts.

Assuming you can read and write Japanese, I recommend you subscribe to the
PHP-users list (http://ns1.php.gr.jp/mailman/listinfo/php-users).
--
Tatsuo Ishii
