Re: encoding names

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Karel Zak <zakkr(at)zf(dot)jcu(dot)cz>
Cc: pgsql-hackers <pgsql-hackers(at)postgreSQL(dot)org>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Subject: Re: encoding names
Date: 2001-08-15 15:16:42
Message-ID: Pine.LNX.4.30.0108151654030.677-100000@peter.localdomain
Lists: pgsql-hackers

Karel Zak writes:

> Some time ago I discussed with Tatsuo and Thomas adding support
> for synonyms of encoding names (for example, allowing
> "ISO-8859-1" as an encoding name) and using binary search to look up
> encoding names.

Funny, I was thinking the same thing last night...

A couple of other things I was thinking about in the encoding area:

If you want to have codeset synonyms, you should also implement the
normalization of codeset names, defined as follows:

1. Remove all characters beside numbers and letters.

2. Fold letters to lowercase.

3. If the same only contains digits prepend the string "iso".
[quote glibc]

This allows ISO_8859-1 and iso88591 to be treated the same, since both
normalize to "iso88591".
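
The three steps above can be sketched roughly as follows. This is only an
illustration in Python, not the glibc code; the function name
normalize_codeset is made up for the example, and restricting step 1 to
ASCII letters and digits is an assumption.

```python
def normalize_codeset(name: str) -> str:
    """Normalize a codeset name following the three quoted steps."""
    # Step 1: drop everything except letters and digits
    # (assumption: only ASCII letters and digits count).
    kept = [ch for ch in name if ch.isascii() and ch.isalnum()]
    # Step 2: fold letters to lowercase.
    folded = "".join(kept).lower()
    # Step 3: if only digits remain, prepend "iso".
    if folded.isdigit():
        folded = "iso" + folded
    return folded
```

With this sketch, "ISO_8859-1", "iso88591" and a bare "8859-1" all come out
as the same key, so a synonym table only needs to store one spelling.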

Here's a good resource of official character set names and aliases:

http://www.iana.org/assignments/character-sets
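
To illustrate how a synonym table could be combined with a binary search,
here is a small sketch. The table is a hypothetical excerpt with only a few
aliases, the canonical names on the right are illustrative labels, and the
lookup assumes its argument has already been normalized by the three steps
above. The name lookup_encoding is invented for the example.

```python
from bisect import bisect_left

# Hypothetical excerpt: (normalized alias, canonical name).
# Kept sorted by alias so that a binary search can be used.
ALIASES = sorted([
    ("iso88591", "ISO-8859-1"),
    ("latin1", "ISO-8859-1"),
    ("l1", "ISO-8859-1"),
    ("iso885915", "ISO-8859-15"),
    ("latin9", "ISO-8859-15"),
])
KEYS = [alias for alias, _ in ALIASES]


def lookup_encoding(normalized: str):
    """Return the canonical name for an already-normalized alias, or None."""
    i = bisect_left(KEYS, normalized)  # binary search over the sorted keys
    if i < len(KEYS) and KEYS[i] == normalized:
        return ALIASES[i][1]
    return None
```

Storing only normalized aliases keeps the table short: spelling variants
that differ in case or punctuation collapse to one entry before the search.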

Also, we ought to have support for the ISO_8859-15 character set, or
people will spread the word that PostgreSQL is not ready for the Euro.

Then I figured, if the client is configured with a locale, it should
automatically determine the client's encoding. Not sure whether this can
be done portably, but it would be very nice to have.
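
One possible way to do this on POSIX-like systems is to ask the locale for
its codeset via nl_langinfo(CODESET). The sketch below uses Python's locale
module as a stand-in for the C calls; the function name
detect_client_codeset is made up, and the fallback for platforms without
nl_langinfo is an assumption about what a reasonable default would be.

```python
import locale


def detect_client_codeset() -> str:
    """Guess the client's codeset from the configured locale."""
    try:
        # Adopt the locale from the environment (LANG, LC_CTYPE, ...).
        # Note: this changes the process-wide locale setting.
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # Unsupported locale in the environment: stay with the current one.
        pass
    if hasattr(locale, "nl_langinfo"):
        # POSIX: query the codeset of the current locale.
        return locale.nl_langinfo(locale.CODESET)
    # Assumption: platforms without nl_langinfo fall back to this.
    return locale.getpreferredencoding(False)
```

The returned name is whatever the platform reports, so it would still need
to go through the normalization and synonym lookup before being used.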

Finally, as I've mentioned before, I'd like to try out the iconv interface.
It might even become an option in 7.2.

--
Peter Eisentraut peter_e(at)gmx(dot)net http://funkturm.homeip.net/~peter
