Re: A question multibye

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: safamack(at)hotmail(dot)com
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: A question multibye
Date: 2001-07-08 08:38:23
Message-ID: 20010708173823B.t-ishii@sra.co.jp
Lists: pgsql-general

From: "Siamack Jabbarzadeh" <safamack(at)hotmail(dot)com>
Subject: A question multibye
Date: Fri, 06 Jul 2001 18:56:53
Message-ID: <F81FKpRcTCi90gP8yJr00010c91(at)hotmail(dot)com>

> Dear Sir/Madam:
> I have some questions on multibyte languages and I hope you can help
> me. First, I was wondering if there is a table (like the ASCII table)
> for multibyte languages?

I am not sure what you want, but PostgreSQL lets you set a default
encoding per database, not per table.

> Second, assume we have an input made up of some Japanese letters mixed
> with special characters like & and % (which have ASCII values). I would
> like to write a parser that takes & and % out and leaves only the Japanese
> letters. Since & and % are ASCII and the letters are multibyte, I cannot
> do the parsing by comparing byte by byte (as we do in normal ASCII). How
> can I do that? Do % and & have multibyte values in multibyte systems? If
> yes, how can I get those values? Could you kindly (if you have some
> solutions to the problem) give me some hints on that?

Japanese has several encodings. I recommend that you use EUC-JP
(Extended Unix Code for Japanese). With EUC-JP, it is very easy to
distinguish Japanese from ASCII even when parsing byte by byte: if a
byte is greater than 0x7f, it is part of a Japanese character;
otherwise it is ASCII.
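
As a rough illustration of the byte-by-byte test, here is a minimal
Python sketch. It assumes the input really is EUC-JP; the function name
strip_ascii_specials and the sample string are made up for this example.

```python
def strip_ascii_specials(data: bytes, unwanted: bytes = b"&%") -> bytes:
    """Drop the given ASCII bytes from EUC-JP encoded data (hypothetical helper)."""
    # In EUC-JP every byte of a multibyte character is above 0x7f, so a
    # byte at or below 0x7f is always a standalone ASCII character and
    # can be compared and dropped on its own.
    return bytes(b for b in data if b > 0x7F or b not in unwanted)


# Sample input: Japanese text mixed with & and %, encoded as EUC-JP.
sample = "日本&語%テキスト".encode("euc_jp")
cleaned = strip_ascii_specials(sample)

# The result is the same Japanese text with the two ASCII characters removed.
print(cleaned == "日本語テキスト".encode("euc_jp"))  # True
```

Note that this does not hold for every Japanese encoding, so the input
encoding has to be known before filtering this way.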

Anyway, I recommend that you study Japanese encodings first.

See:
ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf
--
Tatsuo Ishii
