Re: A Patch for MIC to EUC_TW code converting in mb support

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: cch(at)cc(dot)kmu(dot)edu(dot)tw, pgsql-patches(at)postgresql(dot)org
Subject: Re: A Patch for MIC to EUC_TW code converting in mb support
Date: 2001-01-22 22:21:37
Message-ID: 200101222221.RAA12775@candle.pha.pa.us
Lists: pgsql-docs, pgsql-hackers, pgsql-patches

Tatsuo, I assume these are all done in 7.1, right?

> > ============================================================================
> > 
> > POSTGRESQL BUG REPORT: MIC to EUC_TW code converting in mb support
> > ============================================================================
> > 
> > System Configuration
> > ---------------------
> >   Architecture (example: Intel Pentium)         :x86
> >   Operating System (example: Linux 2.0.26 ELF)  :Linux 2.2.x and FreeBSD 3.5R
> >   PostgreSQL version (example: PostgreSQL-7.0)  :PostgreSQL-7.0.2
> >   Compiler used (example:  gcc 2.8.0)           :egcs-2.91.66, gcc 2.7.3
> > 
> > A FULL description of the problem:
> > ------------------------------------------------
> > In PostgreSQL mb (multi-byte) support, there is a bug in the code
> > conversion from MIC to EUC_TW. The original mic2euc_tw() in conv.c
> > converts CNS 11643-1992 Plane 2 into the 2-byte EUC_TW encoding, but
> > characters in CNS 11643-1992 Plane 2 should be converted into the
> > 4-byte EUC_TW encoding instead.
> > 
> > A way to repeat the problem:
> > ----------------------------------------------------------------------
> > When you initdb with -E EUC_TW and set PGCLIENTENCODING to BIG5,
> > you will find all the characters in CNS 11643-1992 Plane 2 are
> > incorrectly stored or output.
> > 
> > This problem might be fixed by the solution in the attachment.
> 
> Thanks for pointing it out. Your fix seems correct.
> 
> BTW I have found another bug with EUC_TW support. Line 917 in conv.c:
> 
> 			*p++ = c1 - LC_CNS11643_3 + 0xa3;
> 
> this should be:
> 
> 			*p++ = *mic++ - LC_CNS11643_3 + 0xa3;
> 
> Otherwise, CNS 11643-1992 Plane 3 and higher won't work. Could you test
> it out with CNS 11643-1992 Plane 3 or higher?
> 
> If they are ok, I will fix the current source and make a patch for
> 7.0.3 (I guess it's too late to back-patch the 7.0 tree).
> --
> Tatsuo Ishii
> 
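
To make the two fixes quoted above concrete, here is a rough sketch of the per-character conversion they describe. It is not the actual conv.c code: the helper name mic_cns_to_euc_tw() is hypothetical, the constant values are assumptions modelled on pg_wchar.h and should be checked against your tree, and treating 0x9d as the MIC prefix for Plane 3 and higher is likewise an assumption.

```c
#include <stddef.h>

/* Assumed values, modelled on PostgreSQL's pg_wchar.h; verify before use. */
#define SS2            0x8e   /* EUC single-shift 2 */
#define LC_CNS11643_1  0x95   /* MIC leading byte for CNS 11643 Plane 1 */
#define LC_CNS11643_2  0x96   /* MIC leading byte for CNS 11643 Plane 2 */
#define LCPRV2_B       0x9d   /* assumed MIC prefix for Plane 3 and higher */
#define LC_CNS11643_3  0xf6   /* byte after the prefix for Plane 3 */
#define LC_CNS11643_7  0xfa   /* byte after the prefix for Plane 7 */

/*
 * Hypothetical helper: convert one MIC-encoded CNS 11643 character to
 * EUC_TW.  Returns the number of bytes written to out (0 if the input
 * is not a CNS 11643 character) and stores the number of input bytes
 * consumed in *used.  The caller must supply at least 4 bytes in out.
 */
static size_t
mic_cns_to_euc_tw(const unsigned char *mic, unsigned char *out, size_t *used)
{
    unsigned char c1 = mic[0];

    if (c1 == LC_CNS11643_1)
    {
        /* Plane 1: plain 2-byte EUC_TW */
        out[0] = mic[1];
        out[1] = mic[2];
        *used = 3;
        return 2;
    }
    if (c1 == LC_CNS11643_2)
    {
        /* Plane 2: 4-byte EUC_TW (SS2, 0xa2, then the two code bytes) */
        out[0] = SS2;
        out[1] = 0xa2;
        out[2] = mic[1];
        out[3] = mic[2];
        *used = 3;
        return 4;
    }
    if (c1 == LCPRV2_B && mic[1] >= LC_CNS11643_3 && mic[1] <= LC_CNS11643_7)
    {
        /*
         * Plane 3 and higher: the plane is encoded in the byte AFTER the
         * prefix, so the plane byte must be derived from mic[1], not
         * from the prefix c1.
         */
        out[0] = SS2;
        out[1] = (unsigned char) (mic[1] - LC_CNS11643_3 + 0xa3);
        out[2] = mic[2];
        out[3] = mic[3];
        *used = 4;
        return 4;
    }
    *used = 0;
    return 0;
}
```

Under these assumptions, the Plane 2 input 0x96 0xa1 0xa1 comes out as 0x8e 0xa2 0xa1 0xa1, and a Plane 3 input 0x9d 0xf6 0xa1 0xa1 comes out as 0x8e 0xa3 0xa1 0xa1.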


-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman(at)candle(dot)pha(dot)pa(dot)us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
