Re: encoding names

From: Karel Zak <zakkr(at)zf(dot)jcu(dot)cz>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers <pgsql-hackers(at)postgreSQL(dot)org>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Subject: Re: encoding names
Date: 2001-08-15 15:37:31
Message-ID: 20010815173731.B28510@zf.jcu.cz
Lists: pgsql-hackers

On Wed, Aug 15, 2001 at 05:16:42PM +0200, Peter Eisentraut wrote:
> Karel Zak writes:
>
> > some time ago I discussed with Tatsuo and Thomas support for
> > synonyms of encoding names (for example, allowing "ISO-8859-1"
> > as an encoding name) and the use of binary search for looking up
> > encoding names.
>
> Funny, I was thinking the same thing last night...

:-)

> A couple of other things I was thinking about in the encoding area:
>
> If you want to have codeset synonyms, you should also implement the
> normalization of codeset names, defined as such:
>
> 1. Remove all characters beside numbers and letters.
>
> 2. Fold letters to lowercase.
>
> 3. If the same only contains digits prepend the string `"iso"'.
> [quote glibc]
>
> This allows ISO_8859-1 and iso88591 to be treated the same.

My idea is (was :-) to create a table with all "standard" synonyms and
search this table case-insensitively. Something like:

PGencname pg_encnames[] =
{
	{ "ISO-8859-1", LATIN1 },
	{ "LATIN1", LATIN1 }
};

But your idea of encoding name "cleaning" (removing irrelevant chars)
is cooler.
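
A minimal sketch of how the normalization rules quoted above could be
combined with a binary search over a sorted table. This is only an
illustration, not existing PostgreSQL code: the names enc_id, enc_entry,
enc_table, normalize_encname and lookup_encoding are all hypothetical,
and the table holds just the two example synonyms from this mail.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical encoding ids, for illustration only. */
typedef enum { ENC_UNKNOWN = -1, ENC_LATIN1 = 0 } enc_id;

typedef struct
{
	const char *name;	/* name in already-normalized form */
	enc_id		id;
} enc_entry;

/* Must stay sorted by name in strcmp() order, as bsearch() requires. */
static const enc_entry enc_table[] = {
	{ "iso88591", ENC_LATIN1 },
	{ "latin1", ENC_LATIN1 }
};

/*
 * Normalize a codeset name: keep only letters and digits, fold letters
 * to lowercase, and prepend "iso" if the result contains only digits.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int
normalize_encname(const char *in, char *out, size_t outlen)
{
	size_t		n = 0;
	int			has_letter = 0;
	int			has_digit = 0;
	const char *p;

	/* First pass: find out whether the name is digits-only. */
	for (p = in; *p; p++)
	{
		unsigned char c = (unsigned char) *p;

		if (isalpha(c))
			has_letter = 1;
		else if (isdigit(c))
			has_digit = 1;
	}

	if (has_digit && !has_letter)
	{
		if (outlen < 4)
			return -1;
		memcpy(out, "iso", 3);
		n = 3;
	}

	/* Second pass: copy letters and digits, lowercased. */
	for (p = in; *p; p++)
	{
		unsigned char c = (unsigned char) *p;

		if (!isalnum(c))
			continue;
		if (n + 1 >= outlen)
			return -1;
		out[n++] = (char) tolower(c);
	}
	if (n >= outlen)
		return -1;
	out[n] = '\0';
	return 0;
}

/* bsearch() comparator: key is a normalized name, elem a table entry. */
static int
cmp_entry(const void *key, const void *elem)
{
	return strcmp((const char *) key, ((const enc_entry *) elem)->name);
}

/* Look up an encoding by any spelling that normalizes to a table name. */
enc_id
lookup_encoding(const char *name)
{
	char		buf[64];
	const enc_entry *e;

	if (normalize_encname(name, buf, sizeof(buf)) != 0)
		return ENC_UNKNOWN;
	e = bsearch(buf, enc_table,
				sizeof(enc_table) / sizeof(enc_table[0]),
				sizeof(enc_table[0]), cmp_entry);
	return e ? e->id : ENC_UNKNOWN;
}
```

With this, lookup_encoding("ISO_8859-1"), lookup_encoding("iso88591") and
lookup_encoding("Latin-1") would all hit the same entry, so the table only
needs one row per normalized spelling rather than one per variant.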

> Here's a good resource of official character set names and aliases:
>
> http://www.iana.org/assignments/character-sets

Thanks.

> Also, we ought to have support for the ISO_8859-15 character set, or
> people will spread the word that PostgreSQL is not ready for the Euro.

That requires preparing some conversion functions and tables (UTF). Tatsuo?

> Finally, as I've mentioned before I'd like to try out the iconv interface.

Do you want to integrate the iconv stuff into the current PG multibyte
routines, or only as some extension (functions?)?

BTW, is there some \command in psql that prints a list of all supported
encodings?

Maybe we could allow something like: SELECT pg_encoding_names();

Karel

--
Karel Zak <zakkr(at)zf(dot)jcu(dot)cz>
http://home.zf.jcu.cz/~zakkr/

C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz
