Re: Case Conversion Fix for MB Chars

From: Volkan YAZICI <volkan(dot)yazici(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Case Conversion Fix for MB Chars
Date: 2005-11-28 14:49:54
Message-ID: 7104a7370511280649p72f4f302p406a57ce105b0365@mail.gmail.com
Lists: pgsql-patches pgsql-tr-genel

On 11/27/05, Volkan YAZICI <volkan(dot)yazici(at)gmail(dot)com> wrote:
> Tests made on an i686 with a
> 2.6.12.5 kernel. Here's a short list of cases I tried with both latin5
> and unicode charsets:
> - lower/upper functions with Turkish characters.
> - ILIKE matches with both lower and upper case Turkish characters.
> (Above tests succeeded for non-Turkish characters too.)

I read the above paragraph again and realized it is not very useful.
Here's a revised version:

Tests were run on Debian GNU/Linux 3.1 (stable) after patching the
src/backend/utils/adt/like.c (r1.62) and
src/backend/utils/adt/oracle_compat.c (r1.64) files. Related software
versions:
- gcc-3.3 [3.3.5-13]
- libc6-dev [2.3.2.ds1-22]
- locales [2.3.2.ds1-22]

Test cases tried against the patched CVS HEAD:

[For Latin5]
$ usr/bin/initdb -D var/data
$ LANG="tr_TR.ISO-8859-9" usr/bin/postmaster -D var/data
$ usr/bin/createdb -E latin5 test_latin5
$ usr/bin/psql test_latin5
Welcome to psql 8.2devel, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit

test_latin5=# SHOW client_encoding;
client_encoding
-----------------
LATIN5
(1 row)

test_latin5=# SELECT upper('abcdefgğhıijklmnoöprsştuüvyz qwx 0123456789');
upper
-------------------------------------------
ABCDEFGĞHIİJKLMNOÖPRSŞTUÜVYZ QWX 0123456789
(1 row)

test_latin5=# SELECT
test_latin5-# lower('ABCDEFGĞHIİJKLMNOÖPRSŞTUÜVYZ QWX 0123456789');
lower
---------------------------------------------
abcdefgğhıijklmnoöprsştuüvyz qwx 0123456789
(1 row)

test_latin5=# BEGIN;
BEGIN
test_latin5=# CREATE TEMP TABLE t (v varchar);
CREATE TABLE
test_latin5=# COPY t FROM stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> ı123
>> I123
>> i123
>> İ123
>> \.
test_latin5=# SELECT v FROM t;
v
------
ı123
I123
i123
İ123
(4 rows)

test_latin5=# SELECT v FROM t WHERE v ILIKE 'ı%';
v
------
ı123
I123
(2 rows)

test_latin5=# SELECT v FROM t WHERE v ILIKE 'I%';
v
------
ı123
I123
(2 rows)

test_latin5=# SELECT v FROM t WHERE v ILIKE 'i%';
v
------
i123
İ123
(2 rows)

test_latin5=# SELECT v FROM t WHERE v ILIKE 'İ%';
v
------
i123
İ123
(2 rows)

test_latin5=# ROLLBACK;
ROLLBACK
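
The results above can be cross-checked against a small model of Turkish
case folding. This is only a sketch in Python, not the patched C code:
the names TR_LOWER, TR_UPPER, tr_lower, tr_upper and ilike_prefix are
made up for illustration, and the matcher handles only the 'x%' prefix
patterns used in the session, not general LIKE wildcards.

```python
# Hypothetical model of Turkish case folding (illustration only; this is
# not how like.c or oracle_compat.c implement it).
TR_LOWER = {"I": "ı", "İ": "i"}   # dotless and dotted capital I
TR_UPPER = {"ı": "I", "i": "İ"}   # dotless and dotted small i


def tr_lower(s: str) -> str:
    """Lower-case s, mapping I -> ı and İ -> i as a Turkish locale would."""
    return "".join(TR_LOWER.get(ch, ch.lower()) for ch in s)


def tr_upper(s: str) -> str:
    """Upper-case s, mapping ı -> I and i -> İ as a Turkish locale would."""
    return "".join(TR_UPPER.get(ch, ch.upper()) for ch in s)


def ilike_prefix(value: str, prefix: str) -> bool:
    """Case-insensitive prefix match, standing in for ILIKE 'prefix%'."""
    return tr_lower(value).startswith(tr_lower(prefix))


rows = ["ı123", "I123", "i123", "İ123"]

if __name__ == "__main__":
    # Each pattern should select exactly its own dotted or dotless pair.
    for pattern in ["ı", "I", "i", "İ"]:
        print(pattern, [v for v in rows if ilike_prefix(v, pattern)])
```

Under these rules the dotless pair (ı, I) and the dotted pair (i, İ)
never match each other, which is the behaviour the four ILIKE queries
show.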

[For UNICODE]
The same steps as above, with LANG="tr_TR.UTF-8" and both the database
and client encoding set to UNICODE.
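
For reference, the Turkish-specific letters are one byte each in Latin5
(ISO-8859-9) but two bytes each in UTF-8, so the UNICODE run is the one
that exercises multibyte handling. A quick way to see this, sketched in
Python (the helper name byte_forms is made up for illustration):

```python
# Show how the Turkish-specific letters are encoded in the two test encodings.
TURKISH_LETTERS = "ıİğĞşŞ"


def byte_forms(ch: str) -> tuple[bytes, bytes]:
    """Return (Latin5 bytes, UTF-8 bytes) for a single character."""
    return ch.encode("iso8859_9"), ch.encode("utf-8")


if __name__ == "__main__":
    for ch in TURKISH_LETTERS:
        latin5, utf8 = byte_forms(ch)
        print(f"{ch}: latin5={latin5.hex()} ({len(latin5)} byte), "
              f"utf8={utf8.hex()} ({len(utf8)} bytes)")
```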

Hope these tests help.

Regards.
