From: | Daniel Migowski <dmigowski(at)ikoffice(dot)de> |
---|---|
To: | Kris Jurka <books(at)ejurka(dot)com> |
Cc: | pgsql-jdbc(at)postgresql(dot)org |
Subject: | Re: SET client_encoding = 'UTF8' |
Date: | 2008-05-19 08:18:01 |
Message-ID: | 483137B9.5070203@ikoffice.de |
Lists: | pgsql-jdbc |
Kris Jurka wrote:
> On Sun, 18 May 2008, Daniel Migowski wrote:
>> The command SET client_encoding = 'UTF8'
>>
>> throws an exception in the driver, because the driver expects UNICODE.
> This has been discussed before and the problem is that there are too
> many ways to say UTF8 [1]. You can say UTF8, UTF-8, UTF -- 8, and so
> on. Perhaps we should strip all spaces and dashes prior to comparison?
This would be correct in my opinion. I think no one dares to declare a
charset name that relies on characters other than 0-9 and a-z to be
identifiable. IMHO we should just accept whatever postgres itself
accepts (we could dig into the parsing code of postgres). I tried at the
command line, and got the following:
set client_encoding='foobar';
ERROR: invalid value for parameter "client_encoding": "foobar"
set client_encoding='utf8';
OK
set client_encoding='utf-8';
OK
set client_encoding='utf -- 8';
OK
set client_encoding='Utf -- 8';
OK
set client_encoding='Utf -- 98';
ERROR: invalid value for parameter "client_encoding": "Utf -- 98"
set client_encoding='Utf_8';
OK
But I think we should be fine with
userencoding.toLowerCase().replaceAll("[^0-9a-z]", "").equals("utf8"); //
untested prototype code
or something like this.
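A slightly fuller sketch of the same idea, for illustration only. The class
and method names (EncodingNames, normalize, isUtf8) are hypothetical and not
part of the driver; accepting "unicode" as well is an assumption based on the
driver currently expecting UNICODE. Locale.ROOT is used so that lowercasing
does not depend on the JVM's default locale (in a Turkish locale, "UNICODE"
would otherwise not lowercase to "unicode").

```java
import java.util.Locale;

// Hypothetical helper, not taken from the JDBC driver source.
final class EncodingNames {
    private EncodingNames() {}

    // Lowercase the name and drop everything except 0-9 and a-z,
    // so "UTF-8", "utf -- 8" and "Utf_8" all become "utf8".
    static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT).replaceAll("[^0-9a-z]", "");
    }

    // Assumption: "unicode" is treated as equivalent to "utf8",
    // since the driver currently expects UNICODE.
    static boolean isUtf8(String name) {
        String n = normalize(name);
        return n.equals("utf8") || n.equals("unicode");
    }
}
```

With this, isUtf8("Utf -- 8") and isUtf8("Utf_8") are true, while
isUtf8("Utf -- 98") and isUtf8("foobar") are false, matching the server
behaviour shown above.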
>
> [1] http://archives.postgresql.org/pgsql-jdbc/2008-02/threads.php#00174
Thanks for the link.
With best regards,
Daniel Migowski