Re: Invalid EUC_TW character sequence found

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: gene(at)regaltronic(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org, gordon(at)gforce(dot)ods(dot)org
Subject: Re: Invalid EUC_TW character sequence found
Date: 2002-06-26 03:42:06
Message-ID: 20020626.124206.102120976.t-ishii@sra.co.jp
Lists: pgsql-bugs

> To me, the third insert is a character that displays correctly in my application,
> and I do not see any problem. I do not know how to check that
> 'xx' is not a correct EUC_TW character. Please give me some hints for checking,
> thanks!!

Ok, here are some rules to verify EUC_TW characters:

(1) if the first byte is 0x8e, then the 8th bit of each of the
following three bytes must be set

(2) else if the first byte is 0x8f, then the 8th bit of each of the
following two bytes must be set

(3) else if the 8th bit of the first byte is set, then the 8th bit of
the following one byte must be set

(4) else (that means the 8th bit of the first byte is not set) the
byte must be an ASCII character.

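0xa672 does not satisfy these rules: the first byte, 0xa6, has its 8th
bit set, so rule (3) applies, but the second byte, 0x72, does not have
its 8th bit set.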
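
For checking data on the application side, the four rules above can be
sketched in Python as follows. This is only a minimal illustration of
the 8th-bit checks described in this message, not the server's actual
code; the function name check_euc_tw is made up for the example, and
treating a sequence that is cut off at the end of the input as invalid
is an assumption of this sketch. It does not verify that a byte
sequence maps to an assigned character.

```python
def check_euc_tw(data: bytes) -> int:
    """Return the offset of the first sequence that breaks the rules, or -1.

    Hypothetical helper: it applies only the 8th-bit rules (1)-(4) above.
    """
    i = 0
    n = len(data)
    while i < n:
        first = data[i]
        if first == 0x8E:        # rule (1): three following bytes
            trailing = 3
        elif first == 0x8F:      # rule (2): two following bytes
            trailing = 2
        elif first & 0x80:       # rule (3): one following byte
            trailing = 1
        else:                    # rule (4): plain ASCII, nothing follows
            trailing = 0
        tail = data[i + 1 : i + 1 + trailing]
        # Assumption: a sequence truncated by the end of input is invalid.
        if len(tail) < trailing:
            return i
        # Every following byte must have its 8th bit set.
        if any((b & 0x80) == 0 for b in tail):
            return i
        i += 1 + trailing
    return -1


if __name__ == "__main__":
    # 0xa672: first byte has the 8th bit set, second byte does not.
    print(check_euc_tw(b"\xa6\x72"))
```

A return value of -1 means every sequence passed the checks; any other
value is the byte offset where the first offending sequence starts.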
--
Tatsuo Ishii
