Re: BUG #5944: COPY FROM doesn't work with international characters

From: "Nathan M(dot) Davalos" <n(dot)davalos(at)sharedmarketing(dot)com>
To: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #5944: COPY FROM doesn't work with international characters
Date: 2011-03-24 00:33:36
Message-ID: 2701CF596B80DC44815FDBFFF5881A1E0104C32F@exchange01.sharedmarketing.com
Lists: pgsql-bugs

32333030303209416C746F20446573656D7065F16F2C20532E412E20446520432E562E0D0A
The byte in question is F1 (the ñ in "Desempeño").
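For anyone following along, here is a minimal Python sketch (not part of the original exchange) that decodes the hex dump above under the three client encodings discussed in this thread. The variable names are made up for illustration, and Python's codecs are only a stand-in for PostgreSQL's own conversion routines, so treat it as a way to inspect the bytes rather than a reproduction of what COPY does.

```python
# Hex dump of the sample line, copied from the message above.
raw = bytes.fromhex(
    "32333030303209416C746F20446573656D7065F16F2C20532E412E20446520432E562E0D0A"
)

# WIN1252 maps byte F1 to the Latin letter n-with-tilde.
as_win1252 = raw.decode("cp1252")

# WIN1251 is a Cyrillic code page, so F1 becomes a Cyrillic letter instead.
as_win1251 = raw.decode("cp1251")

# In UTF-8, F1 announces a four-byte sequence, but the bytes that follow
# are plain ASCII, so decoding fails at that position.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc
    # The four bytes starting at the failure point.
    rejected = raw[exc.start:exc.start + 4].hex()

print(repr(as_win1252))
print(repr(as_win1251))
print(utf8_error, rejected if utf8_error else "")
```

The four bytes at the failure point (F1 6F 2C 20) are the same ones reported in the "invalid byte sequence" error quoted below, which suggests the file is in a single-byte encoding such as WIN1252 or LATIN1 rather than UTF-8.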

-----Original Message-----
From: John R Pierce [mailto:pierce(at)hogranch(dot)com]
Sent: Wednesday, March 23, 2011 6:49 PM
To: Nathan M. Davalos
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: [BUGS] BUG #5944: COPY FROM doesn't work with international characters

On 03/23/11 4:32 PM, Nathan Davalos wrote:
> ...
> SET CLIENT_ENCODING TO 'WIN1251';
> copy tmpintermediate from 'thefile.txt';
>
>
> Sample contents of thefile:
> 230002 Alto Desempeño, S.A. De C.V.
>
> When using WIN1251 or WIN1252 I get nothing in the second field, it just
> ignores the data. Same thing for LATIN-1.
>
> When using UTF8 for client encoding I get this message:
> ERROR: invalid byte sequence for encoding "UTF8": 0xf16f2c20
> CONTEXT: COPY tmpintermediate , line 1

What is the byte (binary) encoding of the file? In hex:

ñ in win1251 == (no such character; win1251 is Cyrillic)
ñ in win1252 == F1
ñ in UTF-8 == C3 B1
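The byte values listed above can be checked with a short Python sketch. This is an illustration only, not part of the original message; it relies on Python's codec names `cp1251` and `cp1252` for WIN1251 and WIN1252, and the variable names are hypothetical.

```python
# U+00F1 is the letter n-with-tilde, written as an escape so the example
# does not depend on the encoding of this text.
char = "\u00f1"

win1252_bytes = char.encode("cp1252")  # single byte in the Western code page
utf8_bytes = char.encode("utf-8")      # two bytes in UTF-8

# WIN1251 has no slot for this letter, so encoding is expected to fail.
try:
    char.encode("cp1251")
    in_win1251 = True
except UnicodeEncodeError:
    in_win1251 = False

print(win1252_bytes.hex(), utf8_bytes.hex(), in_win1251)
```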
