Re: SET client_encoding = 'UTF8'

From: Daniel Migowski <dmigowski(at)ikoffice(dot)de>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: SET client_encoding = 'UTF8'
Date: 2008-05-19 08:18:01
Message-ID: 483137B9.5070203@ikoffice.de
Lists: pgsql-jdbc

Kris Jurka wrote:
> On Sun, 18 May 2008, Daniel Migowski wrote:
>> The command SET client_encoding = 'UTF8'
>>
>> throws an exception in the driver, because the driver expects UNICODE.
> This has been discussed before and the problem is that there are too
> many ways to say UTF8 [1]. You can say UTF8, UTF-8, UTF -- 8, and so
> on. Perhaps we should strip all spaces and dashes prior to comparison?
That would be correct in my opinion. I don't think anyone would dare to
declare a charset name that relies on characters other than 0-9 and a-z to
be identifiable. IMHO we should just accept whatever postgres itself
accepts (we could dig into the parsing code of postgres). I tried at the
command line, and got the following:

set client_encoding='foobar';
ERROR: Invalid value for parameter »client_encoding«: »foobar«

set client_encoding='utf8';
OK

set client_encoding='utf-8';
OK

set client_encoding='utf -- 8';
OK

set client_encoding='Utf -- 8';
OK

set client_encoding='Utf -- 98';
ERROR: Invalid value for parameter »client_encoding«: »Utf -- 98«

set client_encoding='Utf_8';
OK

But I think we would be fine with


// untested prototype code
userencoding.toLowerCase().replaceAll("[^0-9a-z]", "").equals("utf8");

or something like this.
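
Spelled out a bit more, the idea could look like the sketch below. This is
only an illustration, not the driver's actual code: the class name
EncodingNames and the methods normalize and isUtf8 are made up for this
mail. It also accepts UNICODE, on the assumption that the driver should keep
recognizing the name it expects today, and it uses Locale.ROOT so that the
lowercasing does not depend on the JVM's default locale.

```java
import java.util.Locale;

// Hypothetical helper, not part of the JDBC driver.
final class EncodingNames {
    private EncodingNames() {}

    // Lowercase the name and drop everything except 0-9 and a-z,
    // so "UTF -- 8", "utf-8" and "Utf_8" all become "utf8".
    static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT).replaceAll("[^0-9a-z]", "");
    }

    // True if the user-supplied client_encoding value names UTF8.
    // Accepting "unicode" as well is an assumption made for this sketch.
    static boolean isUtf8(String userEncoding) {
        if (userEncoding == null) {
            return false;
        }
        String normalized = normalize(userEncoding);
        return normalized.equals("utf8") || normalized.equals("unicode");
    }
}
```

With the values I tried at the command line, isUtf8 accepts 'utf8', 'utf-8',
'utf -- 8', 'Utf -- 8' and 'Utf_8', and rejects 'foobar' and 'Utf -- 98'
(which normalizes to "utf98").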
>
> [1] http://archives.postgresql.org/pgsql-jdbc/2008-02/threads.php#00174
Thanks for the link.

With best regards,
Daniel Migowski
