From: | Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: UTF8 with BOM support in psql |
Date: | 2009-11-17 07:40:23 |
Message-ID: | 20091117164023.1513.52131E4D@oss.ntt.co.jp |
Lists: | pgsql-hackers |
Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> I think I could support using the presence of the BOM as a fall-back
> indicator of encoding in absence of any other declaration.
What is the difference between the fall-back and "set client encoding to
UTF-8 if a BOM is found"? My reading of this discussion is that we cannot
accept any automatic encoding detection (strictly speaking, detection is OK,
but automatic assignment is not). So we should not have any fall-back
mechanism, no?
> Also, when the proposed patch to set the encoding from the locale
> appears, we need to make this logic more precise.
The encoding-from-locale feature will be useful, but the patch does *not*
set any encoding. The reason is the same as above.
> Also, I'm not sure if we need this logic only when we send a query. It
> might be better to do this in the lexer when we find a non-ASCII
> character and we don't have a client encoding != SQL_ASCII set yet.
Absolutely, but isn't that an issue independent of the BOM? Multi-byte
scripts without an encoding are always dangerous, whether a BOM is present
or not. I'd say we can always throw an error when we find queries that
contain multi-byte characters and there is no prior encoding declaration.
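To make that rule concrete, here is a rough sketch in Python. It is not psql
code and not part of the patch. The names `require_declared_encoding`,
`first_non_ascii`, and `MissingEncodingError` are hypothetical. Treating
SQL_ASCII as "nothing declared" follows the condition quoted above, and
skipping the BOM bytes themselves during the scan is my own assumption.

```python
import codecs
from typing import Optional


class MissingEncodingError(ValueError):
    """Non-ASCII bytes were found but no client encoding was declared."""


def first_non_ascii(data: bytes) -> Optional[int]:
    """Return the offset of the first byte >= 0x80, or None if all ASCII."""
    for offset, byte in enumerate(data):
        if byte >= 0x80:
            return offset
    return None


def require_declared_encoding(script: bytes,
                              declared_encoding: Optional[str]) -> None:
    """Raise if the script needs an encoding that was never declared.

    A BOM is detected only so that it can be skipped; it is never used
    to assign an encoding.
    """
    # Assumption: SQL_ASCII counts as "no encoding declared yet".
    if declared_encoding is not None and declared_encoding.upper() != "SQL_ASCII":
        return

    body = script
    if body.startswith(codecs.BOM_UTF8):
        # Assumption: the BOM bytes themselves do not trigger the error.
        body = body[len(codecs.BOM_UTF8):]

    offset = first_non_ascii(body)
    if offset is not None:
        raise MissingEncodingError(
            f"non-ASCII byte 0x{body[offset]:02X} at offset {offset} "
            "(after any BOM), but no client encoding was declared"
        )
```

With this check, a script is rejected for the same reason whether or not it
starts with a BOM; only an explicit declaration lets non-ASCII input through.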
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center