From: | Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Encoding issues |
Date: | 2001-10-10 06:40:25 |
Message-ID: | 20011010154025N.t-ishii@sra.co.jp |
Lists: | pgsql-hackers |
After receiving a request to add ISO 8859-15 and 16, I reviewed the
multibyte support code and found several errors in it.
1) There is confusion between "LATIN5" and ISO 8859-5. LATIN5 is not
ISO 8859-5, but is actually ISO 8859-9. Should we rename LATIN5 to
"ISO8859-5" (or whatever) as the encoding name? I think we should.
For your information, here is the correct mapping between ISO
8859-n and LATINn.
ISO 8859-1 LATIN1
ISO 8859-2 LATIN2
ISO 8859-3 LATIN3
ISO 8859-4 LATIN4
ISO 8859-9 LATIN5
ISO 8859-10 LATIN6
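As a rough illustration of the mapping above, a name lookup could be sketched as follows. The struct, table, and function names here are hypothetical and are not taken from the PostgreSQL source; only the six LATINn/ISO 8859-n pairs come from the list above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table (not PostgreSQL's): LATINn alias -> ISO 8859 part number. */
struct latin_alias
{
    const char *latin_name;
    int         iso8859_part;
};

static const struct latin_alias latin_aliases[] = {
    {"LATIN1", 1},
    {"LATIN2", 2},
    {"LATIN3", 3},
    {"LATIN4", 4},
    {"LATIN5", 9},   /* LATIN5 is ISO 8859-9, not ISO 8859-5 */
    {"LATIN6", 10},
};

/* Return the ISO 8859 part number for a LATINn name, or -1 if it is not listed. */
static int
latin_to_iso8859_part(const char *name)
{
    size_t n = sizeof(latin_aliases) / sizeof(latin_aliases[0]);
    size_t i;

    for (i = 0; i < n; i++)
    {
        if (strcmp(latin_aliases[i].latin_name, name) == 0)
            return latin_aliases[i].iso8859_part;
    }
    return -1;
}
```

The comparison is case-sensitive and exact; a real encoding-name lookup would probably want to be more lenient.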
2) The leading characters for some Cyrillic charsets are wrong.
Currently they are defined as:
#define LC_KOI8_R 0x8c /* Cyrillic KOI8-R */
#define LC_KOI8_U 0x8c /* Cyrillic KOI8-U */
#define LC_ISO8859_5 0x8d /* ISO8859 Cyrillic */
These should be:
#define LC_KOI8_R 0x8b /* Cyrillic KOI8-R */
#define LC_KOI8_U 0x8b /* Cyrillic KOI8-U */
#define LC_ISO8859_5 0x8c /* ISO8859 Cyrillic */
Correcting them would affect only users who store their data in
the database using MULE internal code. I think very few people
use MULE internal code, so we could correct them for 7.2.
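To make the impact concrete, here is a minimal sketch of how an old leading byte would translate to the corrected one. The OLD_/NEW_ macro names and the function are hypothetical; only the byte values come from the definitions above. It assumes the caller already knows the byte is a leading character in data written under the old definitions.

```c
/* Leading characters as currently defined (wrong). */
#define OLD_LC_KOI8_R     0x8c  /* also used for KOI8-U */
#define OLD_LC_ISO8859_5  0x8d

/* Leading characters after the proposed correction. */
#define NEW_LC_KOI8_R     0x8b  /* also used for KOI8-U */
#define NEW_LC_ISO8859_5  0x8c

/*
 * Hypothetical helper: map a leading character written under the old
 * definitions to its corrected value. Any other value is returned unchanged.
 * It must be applied exactly once, because the old KOI8-R value (0x8c)
 * is the same as the new ISO 8859-5 value.
 */
static unsigned char
remap_cyrillic_lc(unsigned char old_lc)
{
    switch (old_lc)
    {
        case OLD_LC_KOI8_R:
            return NEW_LC_KOI8_R;
        case OLD_LC_ISO8859_5:
            return NEW_LC_ISO8859_5;
        default:
            return old_lc;
    }
}
```

The overlap at 0x8c is why existing MULE internal code data cannot simply be reread after the change: a byte that meant KOI8-R before would be taken as ISO 8859-5 afterwards.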
Comments?
BTW, should we support ISO 8859-6 and beyond for 7.2? There have been
some requests to do that. Supporting them is actually trivial work and
should be a one-day job. The harder part is writing conversion functions
between encodings. However, there is very little demand for that, I
guess. If so, we could omit the conversion capability for 7.2.
Comments?
--
Tatsuo Ishii