Support UTF-8 files with BOM in COPY FROM

From: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Support UTF-8 files with BOM in COPY FROM
Date: 2011-09-26 04:58:42
Message-ID: CAJW2+qdYg1+xLaHDqnJs3AcKmCSVCDkv_LCAPWUtwmxL9dzVhQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

I'd like to support UTF-8 text or CSV files that have a BOM (byte order mark)
in the COPY FROM command. The BOM will be automatically detected and ignored
if the file encoding is UTF-8. A WIP patch is attached.
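
To illustrate the detection rule described above, here is a minimal sketch in Python. It is not the code from the attached patch (which is not reproduced here). The function name `strip_utf8_bom` and its parameters are hypothetical, chosen only for illustration. It assumes the check is applied once, to the first bytes read from the file, and only when the declared file encoding is UTF-8.

```python
import codecs


def strip_utf8_bom(first_chunk: bytes, file_encoding: str) -> bytes:
    """Drop a leading UTF-8 BOM from the first chunk of input.

    Hypothetical helper: the name and signature are assumptions for
    illustration, not part of PostgreSQL or of the attached patch.
    """
    # Normalize the encoding name so that "UTF8", "utf-8", etc. compare equal.
    try:
        is_utf8 = codecs.lookup(file_encoding).name == "utf-8"
    except LookupError:
        is_utf8 = False

    # The UTF-8 BOM is the three bytes EF BB BF. Skip it only when the
    # file is declared as UTF-8; in any other encoding those bytes are data.
    if is_utf8 and first_chunk.startswith(codecs.BOM_UTF8):
        return first_chunk[len(codecs.BOM_UTF8):]
    return first_chunk
```

For example, `strip_utf8_bom(b"\xef\xbb\xbfa,b\n", "UTF8")` yields `b"a,b\n"`, while the same bytes declared as LATIN1 are returned unchanged.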

I'm only thinking about COPY FROM for reads, but if someone wants to add
a BOM in COPY TO, we might also support COPY TO WITH BOM for writes.

Comments welcome.

--
Itagaki Takahiro

Attachment Content-Type Size
copy_from_bom.patch application/octet-stream 747 bytes
