
Re: Support UTF-8 files with BOM in COPY FROM

From:	Peter Eisentraut <peter_e(at)gmx(dot)net>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, david(at)kineticode(dot)com, itagaki(dot)takahiro(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Support UTF-8 files with BOM in COPY FROM
Date:	2011-09-26 18:49:16
Message-ID:	1317062957.29925.11.camel@vanquo.pezone.net
Lists:	pgsql-hackers

On Mon, 2011-09-26 at 14:44 -0400, Robert Haas wrote:
> > We did recently accept a patch for psql -f to skip over a UTF-8
> > byte-order mark. We had a lot of this same discussion there.
>
> But that case is different, because zero-width, non-breaking space has
> no particular meaning in an SQL script - it's either going to be
> ignored as a BOM, ignored as whitespace, or an error. But inside a
> file being subjected to COPY it might be confusable with data that the
> user wanted to end up in some table.

Yes, my point was more directed toward the discussion about whether BOMs
in UTF-8 are valid at all. But your point pretty much kills this
proposal altogether. If I store a BOM in row 1, column 1 of my table,
because, well, maybe it's an XML document or something, then it needs to
be able to survive a copy out and in. The only way we could proceed with
this would be if we prohibited BOMs in all user data.
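
To make the round-trip concern concrete, here is a minimal sketch in Python. It is not PostgreSQL code: `copy_out` and `copy_in` are hypothetical stand-ins for COPY TO and COPY FROM, and the CSV handling is only an assumption chosen to keep the example short. It shows that a loader which silently drops a leading BOM cannot tell a file marker apart from a BOM that is real data in row 1, column 1.

```python
import csv
import io

BOM = "\ufeff"  # U+FEFF, encoded in UTF-8 as the bytes EF BB BF


def copy_out(rows):
    """Serialize rows to UTF-8 CSV bytes (hypothetical stand-in for COPY TO)."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue().encode("utf-8")


def copy_in(data, strip_bom):
    """Parse UTF-8 CSV bytes into rows (hypothetical stand-in for COPY FROM).

    With strip_bom=True a leading U+FEFF is discarded, which is the
    behaviour under discussion in this thread.
    """
    text = data.decode("utf-8")  # plain "utf-8" keeps a leading U+FEFF
    if strip_bom and text.startswith(BOM):
        text = text[len(BOM):]
    return list(csv.reader(io.StringIO(text)))


# Row 1, column 1 legitimately starts with a BOM (say, a stored XML document).
rows = [[BOM + "<doc/>", "1"], ["plain", "2"]]
dumped = copy_out(rows)

faithful = copy_in(dumped, strip_bom=False)  # round trip preserves the data
stripped = copy_in(dumped, strip_bom=True)   # first value loses its BOM

print(faithful == rows)
print(stripped == rows)
```

The dump begins with the bytes EF BB BF even though nothing added a file-level marker, so the stripping variant changes the first value on reload.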

In response to

Re: Support UTF-8 files with BOM in COPY FROM at 2011-09-26 18:44:12 from Robert Haas

Responses

Re: Support UTF-8 files with BOM in COPY FROM at 2011-09-27 14:21:37 from Peter Eisentraut
