From: | Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> |
---|---|
To: | cnliou(at)eurosport(dot)com |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Rep:Re: [BUGS] Encoding Problem? |
Date: | 2002-03-05 15:01:27 |
Message-ID: | 20020306000127E.t-ishii@sra.co.jp |
Lists: | pgsql-hackers |
> I guess you are inserting correct EUC Traditional Chinese (EUC-TW)
> characters, but it is hard to tell what is happening unless you show
> us the character sequences in hexadecimal format.
> --
> Tatsuo Ishii
> ===============================
> Many thanks! Tatsuo,
>
> Please see below. Best Regards,
>
> CN
> ---------------
> linux:~$ cat /tmp/tt
> 1111
> ¦¨¥\
> ³\
> 2222
> linux:~$ od -t x /tmp/tt
> 0000000 31313131 a5a8a60a 5cb30a5c 3232320a
> 0000020 00000a32
> 0000022
Are you sure that they are EUC-TW? Considering the byte swapping, they
are actually like this:
0x31,0x31,0x31,0x31,0x0a,
0xa6,0xa8,0xa5,0x5c,0x0a,
0xb3,0x5c,0x0a,
0x32,0x32,0x32,0x32,0x0a
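The byte swapping can be undone mechanically. Below is a minimal Python sketch, assuming the dump came from a little-endian machine where `od -t x` prints 4-byte words; the helper name `od_words_to_bytes` is made up for illustration.

```python
import struct

# The od output quoted above, without the "> " quoting prefix.
DUMP = """\
0000000 31313131 a5a8a60a 5cb30a5c 3232320a
0000020 00000a32
0000022
"""

def od_words_to_bytes(dump: str) -> bytes:
    """Rebuild the original byte order from `od -t x` 4-byte words."""
    out = bytearray()
    total = None
    for line in dump.strip().splitlines():
        fields = line.split()
        if len(fields) == 1:
            # The last line holds only the final offset (octal) = file size.
            total = int(fields[0], 8)
            continue
        for word in fields[1:]:
            # Each word was read as a little-endian 32-bit integer.
            out += struct.pack("<I", int(word, 16))
    # Drop the zero padding od added to fill the last word.
    return bytes(out[:total]) if total is not None else bytes(out)

raw = od_words_to_bytes(DUMP)
print(" ".join(f"{b:02x}" for b in raw))
```

Running `od -t x1` instead would print one byte at a time and avoid the swap altogether.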
Here we see 0xa55c and 0xb35c, which should never happen in EUC-TW,
since in each pair the second byte (0x5c) is lower than 0x80.
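The same check can be done by a small script. This is only a rough sketch of the one property mentioned here (a byte of 0x80 or above followed by a byte below 0x80), not a full EUC-TW validator; the function name `suspicious_pairs` is made up for illustration.

```python
# The bytes of /tmp/tt in file order, as listed above.
data = bytes([
    0x31, 0x31, 0x31, 0x31, 0x0a,
    0xa6, 0xa8, 0xa5, 0x5c, 0x0a,
    0xb3, 0x5c, 0x0a,
    0x32, 0x32, 0x32, 0x32, 0x0a,
])

def suspicious_pairs(data: bytes) -> list[str]:
    """Return two-byte sequences whose second byte is below 0x80."""
    found = []
    i = 0
    while i < len(data):
        if data[i] < 0x80:
            i += 1          # plain ASCII, single byte
            continue
        # High byte: treat it as the first byte of a two-byte pair.
        if i + 1 < len(data) and data[i + 1] < 0x80:
            found.append(data[i:i + 2].hex())
        i += 2
    return found

print(suspicious_pairs(data))
```

For the data above this reports a55c and b35c, the two pairs that cannot be EUC-TW.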
I guess they are BIG5. If my guess is correct, you could set the
client encoding to BIG5 ("\encoding BIG5" in psql) and get the
correct result.
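One way to test the guess outside the database is to decode the bytes with a BIG5 codec. The sketch below uses Python's built-in "big5" codec, which is an assumption of convenience: it is not the conversion code PostgreSQL uses, so it only shows that the byte sequences form valid BIG5 pairs.

```python
# The bytes of /tmp/tt in file order.
data = bytes.fromhex("313131310aa6a8a55c0ab35c0a323232320a")

# Decoding raises UnicodeDecodeError if the data is not valid BIG5.
text = data.decode("big5")
lines = text.splitlines()

# Each two-byte pair becomes one character, so the 0x5c bytes are
# consumed as second bytes and no stray backslash remains.
for line in lines:
    print(len(line), line)
```

If the decoding succeeds and the lines have 4, 2, 1 and 4 characters, the data is consistent with BIG5.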
--
Tatsuo Ishii