Support UTF-8 files with BOM in COPY FROM

From: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Support UTF-8 files with BOM in COPY FROM
Date: 2011-09-26 04:58:42
Message-ID: CAJW2+qdYg1+xLaHDqnJs3AcKmCSVCDkv_LCAPWUtwmxL9dzVhQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

I'd like to support UTF-8 text or CSV files that have a BOM (byte order mark)
in the COPY FROM command. The BOM will be automatically detected and ignored
if the file encoding is UTF-8. A WIP patch is attached.
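
To illustrate the detection step, here is a minimal sketch in C. It is not
taken from the attached patch: the helper name utf8_bom_skip and its
encoding_is_utf8 flag are hypothetical, chosen only for this example. It
assumes the caller has the first bytes of the input in a buffer and wants to
know how many of them to discard.

```c
#include <stddef.h>
#include <string.h>

/* Length in bytes of the UTF-8 byte order mark (EF BB BF). */
#define UTF8_BOM_LEN 3

/*
 * Hypothetical helper (not from the patch): return the number of leading
 * bytes to skip in buf. The result is 3 when the file encoding is UTF-8
 * and the buffer starts with the UTF-8 BOM, and 0 otherwise.
 */
static size_t
utf8_bom_skip(const char *buf, size_t len, int encoding_is_utf8)
{
    static const unsigned char bom[UTF8_BOM_LEN] = {0xEF, 0xBB, 0xBF};

    /* Only strip the mark for UTF-8 input that is long enough to hold it. */
    if (!encoding_is_utf8 || len < UTF8_BOM_LEN)
        return 0;

    return memcmp(buf, bom, UTF8_BOM_LEN) == 0 ? UTF8_BOM_LEN : 0;
}
```

A caller would apply this once, to the first chunk read from the file, and
start parsing at buf + utf8_bom_skip(buf, len, is_utf8). Inputs in other
encodings are left untouched, since the same three bytes could be valid data
there.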

For now I'm only considering COPY FROM (reads), but if someone wants COPY TO
to emit a BOM, we might also support COPY TO WITH BOM for writes.

Comments welcome.

-- 
Itagaki Takahiro

Attachment: copy_from_bom.patch
Description: application/octet-stream (747 bytes)
