Re: UTF8 with BOM support in psql

From:	Peter Eisentraut <peter_e(at)gmx(dot)net>
To:	Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: UTF8 with BOM support in psql
Date:	2009-11-16 20:37:07
Message-ID:	1258403827.21773.9.camel@vanquo.pezone.net
Lists:	pgsql-hackers

On Wed, 2009-10-21 at 13:11 +0900, Itagaki Takahiro wrote:
> Sure. Client encoding is declared in body of a file, but BOM is
> in head of the file. So, we should always ignore BOM sequence
> at the file head no matter what client encoding is used.
>
> The attached patch replaces the BOM with white spaces, but it does not
> change client encoding automatically. I think we can always ignore
> client encoding at the replacement because SQL command cannot start
> with BOM sequence. If we don't ignore the sequence, execution of
> the script must fail with syntax error.

OK, I think the consensus here is:

- Eat BOM at beginning of file (as you implemented)

- Only when client encoding is UTF-8 --> please fix that

I'm not sure if replacing a BOM by three spaces is a good way to
implement "eating", because it might throw off a column indicator
somewhere, say, but I couldn't reproduce a problem. Note that the
U+FEFF character is defined as *zero-width* non-breaking space.
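
To make the two points above concrete, here is a minimal sketch of one
way to "eat" the BOM by skipping it rather than overwriting it with
spaces, so that nothing is left behind to shift a column count. This is
not the attached patch and not psql's actual code: the function name
skip_utf8_bom and both boolean parameters are hypothetical, and it
assumes the caller knows whether it is reading the first line of the
file and whether the client encoding is UTF-8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The three bytes that encode U+FEFF in UTF-8. */
static const unsigned char UTF8_BOM[3] = {0xEF, 0xBB, 0xBF};

/*
 * Hypothetical helper: return a pointer just past a leading UTF-8 BOM,
 * or the line unchanged if there is nothing to skip.
 *
 * The BOM is skipped only on the first line of a file, and only when
 * the client encoding is UTF-8.  Both flags are assumed to be supplied
 * by the caller.
 */
const char *
skip_utf8_bom(const char *line, bool is_first_line, bool client_is_utf8)
{
    assert(line != NULL);

    /*
     * strncmp() stops at the terminating NUL of a short line, so a
     * line shorter than three bytes is never read past its end.
     */
    if (is_first_line && client_is_utf8 &&
        strncmp(line, (const char *) UTF8_BOM, sizeof(UTF8_BOM)) == 0)
        return line + sizeof(UTF8_BOM);

    return line;
}
```

With this approach the first statement in the file starts at offset
zero of the returned string, whereas a space-replacement approach
leaves three extra characters in front of it.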

In response to

Re: UTF8 with BOM support in psql at 2009-10-21 04:11:59 from Itagaki Takahiro

Responses

Re: UTF8 with BOM support in psql at 2009-11-16 21:01:53 from Tom Lane
Re: UTF8 with BOM support in psql at 2009-11-17 00:31:51 from Itagaki Takahiro
Re: UTF8 with BOM support in psql at 2009-11-21 23:59:18 from Peter Eisentraut
