Re: Support UTF-8 files with BOM in COPY FROM

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support UTF-8 files with BOM in COPY FROM
Date: 2011-09-26 11:47:54
Message-ID: CABUevEwNSAT28h8wN76A3q2edBKoBYU=ms-zi+1bzKGRS4aO0g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 26, 2011 at 13:36, Itagaki Takahiro
<itagaki(dot)takahiro(at)gmail(dot)com> wrote:
> On Mon, Sep 26, 2011 at 20:12, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> I like it in general. But if we're looking at the BOM, shouldn't we
>> also check for and *reject* the file if it has a BOM for a non-UTF8
>> encoding? Say if the BOM claims it's UTF16?
>
> -1 because we're depending on manual configuration for now.
> It would be reasonable if we had used automatic detection of
> character encoding, but we don't. In addition, some crazy
> encoding might use BOM codes as a valid character.

Does such an encoding really exist? And the code only executes when
the user thinks he's in UTF8, right? So it would still only happen if
an incorrect encoding was specified.
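
To make the proposal concrete, here is a rough sketch of the check I have
in mind. It is not the patch and not PostgreSQL code: it is Python for
brevity, and the function name strip_or_reject_bom and its arguments are
made up for illustration. It only acts when the declared encoding is UTF8,
skips a UTF-8 BOM, and rejects a UTF-16 or UTF-32 BOM. Note that the bytes
0xFE and 0xFF never occur in valid UTF-8, so a file starting with one of
those BOMs could not be valid UTF-8 input anyway.

```python
import codecs

# Hypothetical helper for illustration only; not part of PostgreSQL.
UTF8_BOM = codecs.BOM_UTF8  # b'\xef\xbb\xbf'

# UTF-32 is listed first because the UTF-32LE BOM (FF FE 00 00)
# begins with the UTF-16LE BOM (FF FE).
FOREIGN_BOMS = (
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
)


def strip_or_reject_bom(data: bytes, declared_encoding: str) -> bytes:
    """Return data without a leading UTF-8 BOM, or raise on a foreign BOM.

    The check runs only when the user declared UTF8; for any other
    declared encoding the bytes are returned untouched.
    """
    if declared_encoding.upper().replace("-", "") != "UTF8":
        return data

    # A UTF-8 BOM is harmless: skip it.
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]

    # A UTF-16/UTF-32 BOM contradicts the declared encoding: reject.
    for bom, name in FOREIGN_BOMS:
        if data.startswith(bom):
            raise ValueError(
                f"input starts with a {name} BOM but encoding is UTF8"
            )

    return data
```

With this shape, a file in some other encoding is never inspected unless
the user declared it as UTF8, which is the "incorrect encoding was
specified" case above.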

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
