Re: UNICODE

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: jklcom(at)mindspring(dot)com
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: UNICODE
Date: 2001-10-30 01:01:39
Message-ID: 20011030100139C.t-ishii@sra.co.jp
Lists: pgsql-general

Could you please not send me personal mail?
Let's share information with everyone on the mailing list.
Anyway...

> I've tried that. Still not writing the Chinese characters correctly.

I don't know which Chinese character set you are using, but at the
very least your code will not work if it is Big5, because the second
byte of a Big5 character can fall in the ASCII range.
To learn more about character sets, see, for example,
ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf
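
As a rough illustration of the point above (a sketch, not code from this
thread): a byte-wise search for an ASCII delimiter can match the second
byte of a two-byte Big5 character. The function names below are
hypothetical, and the lead-byte range 0x81..0xFE is an assumption that
covers common Big5 variants; the exact range depends on the variant in use.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: any byte in 0x81..0xFE starts a two-byte Big5 character. */
static int is_big5_lead(unsigned char c)
{
    return c >= 0x81 && c <= 0xFE;
}

/* Naive search: compares every byte, so it can match a trail byte. */
static long find_byte_naive(const unsigned char *s, unsigned char stop)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; s[i] != '\0'; i++)
        if (s[i] == stop)
            return (long)i;
    return -1;
}

/* Big5-aware search: skips lead and trail byte together, so only
 * single-byte characters are compared against the delimiter. */
static long find_byte_big5(const unsigned char *s, unsigned char stop)
{
    size_t i = 0;

    assert(s != NULL);
    while (s[i] != '\0') {
        if (is_big5_lead(s[i]) && s[i + 1] != '\0') {
            i += 2;             /* step over the whole two-byte character */
            continue;
        }
        if (s[i] == stop)
            return (long)i;
        i++;
    }
    return -1;
}
```

For the byte sequence 0xA5 0x5C 'a' '\\' 'b', where 0xA5 0x5C is a single
Big5 character whose second byte equals '\\', the naive search reports
index 1 (inside the character) while the Big5-aware search reports index 3.
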
--
Tatsuo Ishii

> Here is the code:
>
> contentTypeFromPost = getenv("CONTENT_TYPE");
> contentTypeLength = getenv("CONTENT_LENGTH");
> icontentLength = atoi(contentTypeLength);
>
> if((queryString = malloc(icontentLength + 1)) == NULL)
> {
> postMessage("Cannot allocate memory", 0);
> return(0);
> }
> for(i=0; *queryString; i++)
> {
> splitword(items.Item, queryString, '&');
> unescape_url(items.Item);
> splitword(items.name, items.Item, '=');
>
> // items.Item contains double-byte characters here.
> // However, when I write it to the database I get unrecognizable data.
> }
>
> void splitword(uchar *out, uchar *in, uchar stop)
> {
> int i, j;
>
> while(*in == ' ') in++; /* skip past any spaces */
>
> for(i = 0; in[i] && (in[i] != stop); i++)
> out[i] = in[i];
>
> out[i] = '\0'; /* terminate it */
> if(in[i]) ++i; /* position past the stop */
>
> while(in[i] == ' ') i++; /* skip past any spaces */
>
> /* shift the rest of in down, copying the terminating '\0' and then stopping */
> for(j = 0; (in[j] = in[i]) != '\0'; i++, j++)
> ;
> }
>
> uchar x2c(uchar *x)
> {
> register uchar c;
>
> /* note: (x & 0xdf) makes x upper case */
> c = (x[0] >= 'A' ? ((x[0] & 0xdf) - 'A') + 10 : (x[0] - '0'));
> c *= 16;
> c += (x[1] >= 'A' ? ((x[1] & 0xdf) - 'A') + 10 : (x[1] - '0'));
> return(c);
> }
>
> void unescape_url(uchar *url)
> {
> register int i, j;
>
> for(i = 0, j = 0; url[j]; ++i, ++j)
> {
> if((url[i] = url[j]) == '%')
> {
> url[i] = x2c(&url[j + 1]);
> j += 2;
> }
> else if (url[i] == '+')
> url[i] = ' ';
> }
> url[i] = '\0'; /* terminate it at the new length */
> }
>
> -----Original Message-----
> From: Tatsuo Ishii [mailto:t-ishii(at)sra(dot)co(dot)jp]
> Sent: Sunday, October 28, 2001 4:57 PM
> To: jklcom(at)mindspring(dot)com
> Cc: pgsql-general(at)postgresql(dot)org
> Subject: RE: [GENERAL] UNICODE
>
>
> > I'm also trying to write some Chinese data to a postgresql database. I'm
> > getting gibberish after it's written to the database.
> >
> > I recognize the problem is at the http request. How do I retrieve double
> > byte characters through http request using C/C++? And how do I write it
> > to the database?
>
> Nothing special. Just read/write one by one.
>
> > And how do I tell it what kind of encoding to use?
>
> set client_encoding.
> --
> Tatsuo Ishii
>
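
The quoted advice to read and write the bytes one by one can be sketched
as a percent-decoder that treats the form data as opaque bytes. This is
an illustrative sketch under stated assumptions, not the code from this
thread: the names hex_value and percent_decode are hypothetical, and
unlike the quoted x2c it checks that both characters after '%' are hex
digits before decoding them.

```c
#include <assert.h>
#include <stddef.h>

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode a URL-encoded string in place and return the decoded length.
 * Bytes >= 0x80 are copied unchanged, so multibyte text survives intact.
 * A malformed escape (for example a trailing '%') is copied literally.
 * Note: a decoded "%00" embeds a NUL byte; callers should use the
 * returned length rather than strlen() if that can occur. */
static size_t percent_decode(unsigned char *s)
{
    size_t i = 0, j = 0;
    int hi, lo;

    assert(s != NULL);
    while (s[j] != '\0') {
        if (s[j] == '%'
            && (hi = hex_value(s[j + 1])) >= 0
            && (lo = hex_value(s[j + 2])) >= 0) {
            s[i++] = (unsigned char)(hi * 16 + lo);
            j += 3;
        } else if (s[j] == '+') {
            s[i++] = ' ';       /* '+' encodes a space in form data */
            j++;
        } else {
            s[i++] = s[j++];
        }
    }
    s[i] = '\0';
    return i;
}
```

Decoding "n=%A5%5C+x" this way yields the six bytes 'n', '=', 0xA5, 0x5C,
' ', 'x'. Whether those bytes are stored correctly then depends on the
client encoding matching the bytes the browser actually sent.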
