From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | tgl(at)sss(dot)pgh(dot)pa(dot)us, david(at)kineticode(dot)com, itagaki(dot)takahiro(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Support UTF-8 files with BOM in COPY FROM |
Date: | 2011-09-26 16:07:26 |
Message-ID: | CA+Tgmoa7SzcuViKfdbmWWeRmzZnjo93AmbhiOHaO9E=330PFow@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Sep 26, 2011 at 11:09 AM, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>> "David E. Wheeler" <david(at)kineticode(dot)com> writes:
>>> On Sep 25, 2011, at 9:58 PM, Itagaki Takahiro wrote:
>>>> I'm thinking about only COPY FROM for reads, but if someone wants to add
>>>> BOM in COPY TO, we might also support COPY TO WITH BOM for writes.
>>
>>> I think it would have to be optional, since "some recipients of UTF-8 encoded data do not expect a BOM."
>>
>> Putting a BOM into UTF8 data is flat out invalid per spec --- the fact
>> that Microsloth does it does not make it standards-conformant.
>>
>> I think that accepting it on input can be sensible, on the principle of
>> "be liberal in what you accept", but the other side of that is "be
>> conservative in what you send". No BOMs in output, please.
>
> Suppose a user uses brain-dead editor, which does not accept UTF-8
> without BOM. He decides to save his editor data into PostgreSQL using
> COPY FROM. He extracts the data using COPY TO. Now he finds that his
> stupid editor does not accept his data any more.
>
> So I think if we decide to accept UTF-8 with BOM, we should keep BOM
> when importing the data and output the data with BOM. If we don't want
> to output UTF-8 with BOM, we should not accept UTF-8 with BOM. It
> seems we don't have much choice...
Maybe this needs to be an optional behavior, controlled by some COPY option.
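
To make the idea concrete, here is a minimal client-side sketch in Python of what "optional" could mean: a leading UTF-8 BOM is dropped on input, and one is written on output only when explicitly requested. The function names and the `with_bom` flag are hypothetical and chosen purely for illustration; this is not existing or proposed COPY syntax, and it says nothing about how the server would implement it.

```python
import codecs


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def export_bytes(data: bytes, with_bom: bool = False) -> bytes:
    """Return data for output, prefixed with a BOM only if with_bom is set.

    with_bom is a hypothetical switch standing in for "some COPY option";
    the default matches "be conservative in what you send".
    """
    payload = strip_bom(data)  # never emit a doubled BOM
    return codecs.BOM_UTF8 + payload if with_bom else payload


if __name__ == "__main__":
    from_editor = codecs.BOM_UTF8 + "1,caf\u00e9\n".encode("utf-8")
    loaded = strip_bom(from_editor)             # liberal on input
    print(export_bytes(loaded))                 # no BOM by default
    print(export_bytes(loaded, with_bom=True))  # BOM only on request
```

With a switch like this, the editor scenario above round-trips only if the user asks for the BOM on output; the default output stays BOM-free.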
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company