Re: COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence

From: Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: steven(at)trumpet(dot)io, pgsql-bugs(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence
Date: 2010-08-19 23:29:57
Message-ID: 20100820.082957.113300986.t-ishii@sraoss.co.jp
Lists: pgsql-bugs, pgsql-hackers
> We generally assume that in server-safe encodings, the ctype.h functions
> will behave sanely on any single-byte value.

I think this "wisdom" is only true for the C locale.  I'm not at all
surprised that it does not work with non-C locales.

From array_funcs.c:

		while (isspace((unsigned char) *p))
			p++;

IMO this should be something like:

		while (isspace((unsigned char) *p))
			p += pg_mblen(p);
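
For illustration only, here is a minimal standalone sketch of the same idea outside the backend. It assumes UTF-8 input. utf8_mblen() and skip_space_mb() are hypothetical names made up for this example: utf8_mblen() is a simplified stand-in for pg_mblen(), not the real function. The point is that isspace() is only ever applied to the first byte of a character, never to a trailing byte of a multibyte sequence.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for pg_mblen(), assuming UTF-8 only: returns the
 * byte length implied by the lead byte at *p.  Bytes that are not valid
 * lead bytes are treated as length 1 so the caller always makes progress.
 */
static int
utf8_mblen(const char *p)
{
	unsigned char c = (unsigned char) *p;

	if (c < 0x80)
		return 1;
	if ((c & 0xE0) == 0xC0)
		return 2;
	if ((c & 0xF0) == 0xE0)
		return 3;
	if ((c & 0xF8) == 0xF0)
		return 4;
	return 1;
}

/*
 * Skip leading whitespace one character at a time.  isspace() is only
 * tested on the first byte of each character; the inner loop stops at the
 * terminator so a truncated sequence cannot run past the end of the string.
 */
static const char *
skip_space_mb(const char *p)
{
	while (isspace((unsigned char) *p))
	{
		int			len = utf8_mblen(p);

		while (len-- > 0 && *p != '\0')
			p++;
	}
	return p;
}
```

In the C locale this behaves the same as the byte-wise loop, since only ASCII whitespace is matched there. The two can only differ in a locale where isspace() returns true for a byte >= 0x80.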
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
