Re: UTF8 with BOM support in psql

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UTF8 with BOM support in psql
Date: 2009-10-20 15:13:19
Message-ID: 4ADDD38F.50804@dunslane.net
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>
>> Seems there is community support for accepting BOM:
>> http://archives.postgresql.org/pgsql-hackers/2009-09/msg01625.php
>>
>
> That discussion has approximately nothing to do with the
> much-more-invasive change that Itagaki-san is suggesting.
>
> In particular I think an automatic change of client_encoding isn't
> particularly a good idea --- wouldn't you have to change it back later,
> and is there any possibility of creating a security issue from such
> behavior? Remember that client_encoding *IS* tied to security issues
> such as backslash escape handling.

Yeah, I don't think we should be second-guessing the user on the encoding.

What I think we might sensibly do is to eat the leading BOM of an SQL
file iff the client encoding is UTF8, and otherwise treat it as just
bytes in whatever the encoding is.
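
As a rough illustration of that rule (not psql's actual code, which is C), here is a minimal Python sketch. The function name `strip_leading_bom` and the way the encoding name is normalized are assumptions made up for this example; the only behavior taken from the proposal is "drop a leading UTF-8 BOM iff the client encoding is UTF8, otherwise leave the bytes alone".

```python
import codecs


def strip_leading_bom(data: bytes, client_encoding: str) -> bytes:
    """Drop a leading UTF-8 BOM only when the client encoding is UTF8.

    Hypothetical helper for illustration; not part of psql or libpq.
    """
    # Accept common spellings such as "UTF8", "utf-8" and "UTF_8".
    normalized = client_encoding.replace("-", "").replace("_", "").lower()

    # codecs.BOM_UTF8 is the three bytes EF BB BF.
    if normalized == "utf8" and data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]

    # Any other encoding: the same three bytes are ordinary data,
    # so the input is returned unchanged.
    return data
```

Note that the client encoding is only consulted, never changed, and that a BOM sequence appearing anywhere other than the very start of the input is left in place.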

Should we also do the same for files passed via \copy? What about
streams on stdin? What about files read from the backend via COPY?

cheers

andrew
