Re: UTF8 with BOM support in psql

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UTF8 with BOM support in psql
Date: 2009-11-18 13:52:20
Message-ID: 4B03FC14.4080703@dunslane.net
Lists: pgsql-hackers

Peter Eisentraut wrote:
> But now we're back to the original problem. Certain editors insert BOMs
> at the beginning of the file. And that is by any definition before the
> embedded client encoding declaration. I think the only ways to solve
> this are:
>
> 1) Ignore the BOM if a client encoding declaration of UTF8 appears in a
> narrowly defined location near the beginning of the file (XML and
> PEP-0263 style). For *example*, we could ignore the BOM if the file
> starts with exactly "<BOM>\encoding UTF8\n". Would probably not work
> well in practice.
>
> 2) Parse two alternative versions of the file, one with the BOM ignored
> and one with the BOM not ignored, until you need to make a decision.
> Hilariously complicated, but would perhaps solve the problem.
>
> 3) Give up, do nothing.
>
>

4) Set the client encoding before the file is read, in any of the ways
that have already been discussed, and then allow psql to eat the BOM.
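
To make option 4 concrete, here is a rough sketch in Python. It is not
psql code, and the function name and its parameters are made up for
illustration. It assumes the client encoding is already known before the
first byte of the file is looked at, which is the whole point of this
option: the BOM is dropped only when the encoding is UTF8, because in a
single-byte encoding such as LATIN1 the same three bytes are ordinary
characters.

```python
import codecs

# The UTF-8 encoding of U+FEFF: the three bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def strip_bom_if_utf8(data: bytes, client_encoding: str) -> bytes:
    """Drop a leading UTF-8 BOM, but only if the encoding is already UTF8.

    Hypothetical helper: `client_encoding` stands for whatever was set
    before the file was opened (how it gets set is outside this sketch).
    """
    # Accept common spellings such as "UTF8", "utf-8" and "utf_8".
    normalized = client_encoding.replace("-", "").replace("_", "").upper()
    if normalized == "UTF8" and data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    # Any other encoding: leave the bytes alone, they may be real data.
    return data
```

With the encoding fixed up front there is nothing to guess, so neither
the narrow pattern match of option 1 nor the double parse of option 2 is
needed.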

cheers

andrew
