Re: String encoding during connection "handshake"

From: sulfinu(at)gmail(dot)com
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Usama Munir <usama(dot)munir(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: String encoding during connection "handshake"
Date: 2007-11-28 15:54:05
Message-ID: 200711281754.05364.sulfinu@gmail.com
Lists: pgsql-hackers

Martijn,

:) Don't take it personally; I am just trying to confirm that I have
understood the problem correctly. After all, it's just that C has a very
outdated notion of "char"s (and no notion of Unicode). I was naively under
the impression that "char"s had evolved in modern C.

Regarding the problem of "One True Encoding", the answer seems obvious to me:
use only one encoding per database cluster, either UTF-8 or UTF-16 or another
Unicode-aware scheme, whichever yields a statistically smaller database for
the languages employed by the users in their data. This encoding should be a
one-time choice! De facto, this is already the case, because one cannot
change collation rules after a cluster has been created.
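
To illustrate what "statistically smaller" means here, a minimal Python
sketch follows. It only compares the encoded size of two sample strings; the
function name encoded_sizes and the sample texts are made up for this
example, and real data would of course need a much larger sample.

```python
# Hypothetical helper: byte length of the same text in two Unicode encodings.
def encoded_sizes(text):
    # "utf-16-le" is used so that no byte-order mark is counted.
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le")}

latin_sample = "Have a nice day"      # ASCII-only text
cjk_sample = "日本語のテキスト"        # Japanese text, all in the BMP

for sample in (latin_sample, cjk_sample):
    print(sample, encoded_sizes(sample))
```

For ASCII-heavy text UTF-8 is the smaller choice; for text dominated by CJK
characters UTF-16 can come out smaller.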

During the handshake, all clients should be assumed to send data in the
cluster's encoding.
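
The underlying problem can be sketched in a few lines of Python. This is not
how the server is implemented; it only shows, under the assumption that
names are compared as raw bytes with no conversion, why a non-ASCII name
stored in one encoding fails to match the same name sent in another. The
function name same_bytes and the sample role names are invented for the
example.

```python
# Hypothetical check: do two encodings produce identical bytes for a name?
def same_bytes(name, admin_encoding, client_encoding):
    stored = name.encode(admin_encoding)   # bytes as stored by the administrator
    sent = name.encode(client_encoding)    # bytes as sent by the client
    return stored == sent

# A pure-ASCII name survives, because both encodings agree on ASCII.
print(same_bytes("jose", "latin-1", "utf-8"))
# A non-ASCII name does not: "é" is one byte in Latin-1, two in UTF-8.
print(same_bytes("José", "latin-1", "utf-8"))
```

If every client used the cluster's encoding, both sides would produce the
same bytes and the comparison would succeed.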

Have a nice day, too.

On Wednesday 28 November 2007, Martijn van Oosterhout wrote:
> On Wed, Nov 28, 2007 at 11:39:33AM +0200, sulfinu(at)gmail(dot)com wrote:
> > During the authentication phase, no such conversion takes place - you
> > were right and I couldn't believe it! In the case when your database
> > name, your user name or password contain non-ASCII characters, you're out
> > of luck if the stored values were submitted in another encoding by the
> > administrator.
>
> The problem is: which conversion? You don't know the encoding of the
> server yet (because you haven't selected a DB) and you don't know the
> encoding of the client. The only real possibility is to declare One
> True Encoding and decree every username/password be in that. But you're
> never going to get people to agree on that.
>
> > I assume that no name conversion takes place between client and cluster
> > metadata when a role is created (CREATE ROLE... PASSWORD...) or when a
> > database is created (CREATE DATABASE...). Or does it? In that case, the
> > names are encoded in the encoding of the database that the administrator
> > was connected to.
>
> Honestly, UNIX usernames/passwords have always worked like this so
> we're not really doing anything weird by doing it this way. Users need
> to type the password in the same encoding it was added in. It's not
> usually a big deal because people set their own passwords...
>
> Have a nice day,
