Re: ILIKE Patch for latin5

From: Devrim GUNDUZ <devrim(at)gunduz(dot)org>
To: Volkan YAZICI <volkan(dot)yazici(at)gmail(dot)com>
Cc: pgsql-tr-genel(at)postgresql(dot)org
Subject: Re: ILIKE Patch for latin5
Date: 2005-11-26 17:13:48
Message-ID: Pine.LNX.4.63.0511261708170.7048@mail.kivi.com.tr
Lists: pgsql-tr-genel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

On Sat, 26 Nov 2005, Volkan YAZICI wrote:

> On 11/26/05, Devrim GUNDUZ <devrim(at)gunduz(dot)org> wrote:
>> Why don't you send it to -hackers? The discussion there would be
>> much more useful.
>
> I will send it to pgsql-patches, but I don't want to take up the
> list's time needlessly. Let's do the necessary testing ourselves
> first, so that we know it does the job for us (without breaking
> anything that already worked), and then we can ask the PostgreSQL
> crew to commit it to CVS.

You're right. Let me ask one more thing: since this patch only fixes
latin5, do you think it can gain general acceptance? You know the code
policy well too; that's why I thought I'd ask...

Well then... I'll try this out on the binaries on Sunday.

See you.
- --
Devrim GUNDUZ
Kivi Bilişim Teknolojileri - http://www.kivi.com.tr
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFDiJfR4zE8DGqpiZARAgldAKCSLrerHEhZeKEuYcJKCN1IZzSFowCgpfya
RJfwNE5DDAwmvQdngAgw6Fg=
=NHCy
-----END PGP SIGNATURE-----
>From pgsql-tr-genel-owner(at)postgresql(dot)org Sat Nov 26 15:48:02 2005
Message-ID: <7104a7370511261147t10958f57y7631c0e324d096c(at)mail(dot)gmail(dot)com>
Date: Sat, 26 Nov 2005 21:47:56 +0200
From: Volkan YAZICI <volkan(dot)yazici(at)gmail(dot)com>
To: pgsql-patches(at)postgresql(dot)org, pgsql-tr-genel(at)postgresql(dot)org
Subject: Case Conversion Fix for MB Chars
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----=_Part_37336_13553736.1133034476289"

------=_Part_37336_13553736.1133034476289
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

Here's a small patch to fix case conversion problems in ILIKE and in
the (Oracle compat.) lower/upper functions. (The related bug report is
here: http://archives.postgresql.org/pgsql-bugs/2005-10/msg00001.php)

In my tests it worked for Turkish characters with the LATIN5 encoding,
but it still doesn't work when the encoding is UNICODE. (IMHO there
will be no compatibility problems for the latin-N encodings; for
Unicode, I have no idea.)

Regards.

------=_Part_37336_13553736.1133034476289
Content-Type: application/octet-stream; name=case_conversion.patch
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="case_conversion.patch"

Index: src/backend/utils/adt/like.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/like.c,v
retrieving revision 1.62
diff -u -r1.62 like.c
--- src/backend/utils/adt/like.c 15 Oct 2005 02:49:28 -0000 1.62
+++ src/backend/utils/adt/like.c 26 Nov 2005 19:46:24 -0000
@@ -72,38 +72,32 @@
*/
#define CHARMAX 0x80

-static int
-iwchareq(char *p1, char *p2)
+static pg_wchar
+mbtolower(char *t)
{
- pg_wchar c1[2],
- c2[2];
+ pg_wchar c[2];
int l;

- /*
- * short cut. if *p1 and *p2 is lower than CHARMAX, then we could assume
- * they are ASCII
- */
- if ((unsigned char) *p1 < CHARMAX && (unsigned char) *p2 < CHARMAX)
- return (tolower((unsigned char) *p1) == tolower((unsigned char) *p2));
+ l = pg_mblen(t);
+ (void) pg_mb2wchar_with_len(t, c, l);
+ return tolower(c[0]);
+}

- /*
- * if one of them is an ASCII while the other is not, then they must be
- * different characters
- */
- else if ((unsigned char) *p1 < CHARMAX || (unsigned char) *p2 < CHARMAX)
- return (0);
+static int
+iwchareq(char *p1, char *p2)
+{
+ pg_wchar c1, c2;

/*
- * ok, p1 and p2 are both > CHARMAX, then they must be multibyte
- * characters
+ * Lowercasing by looking at if the character is
+ * ASCII (< CHARMAX) or not.
*/
- l = pg_mblen(p1);
- (void) pg_mb2wchar_with_len(p1, c1, l);
- c1[0] = tolower(c1[0]);
- l = pg_mblen(p2);
- (void) pg_mb2wchar_with_len(p2, c2, l);
- c2[0] = tolower(c2[0]);
- return (c1[0] == c2[0]);
+ c1 = ((unsigned char) *p1 < CHARMAX)
+ ? tolower((unsigned char) *p1) : mbtolower(p1);
+ c2 = ((unsigned char) *p2 < CHARMAX)
+ ? tolower((unsigned char) *p2) : mbtolower(p2);
+
+ return (c1 == c2);
}

#define CHAREQ(p1, p2) wchareq(p1, p2)
Index: src/backend/utils/adt/oracle_compat.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/oracle_compat.c,v
retrieving revision 1.64
diff -u -r1.64 oracle_compat.c
--- src/backend/utils/adt/oracle_compat.c 4 Nov 2005 22:19:04 -0000 1.64
+++ src/backend/utils/adt/oracle_compat.c 26 Nov 2005 19:46:25 -0000
@@ -258,6 +258,22 @@
#define wcstotext win32_wcstotext
#endif /* WIN32 */

+static pg_wchar
+mb2wchar_with_len(char inp)
+{
+ char t[1];
+ pg_wchar c[2];
+ int l;
+
+ t[0] = inp;
+ l = pg_mblen(t);
+ (void) pg_mb2wchar_with_len(t, c, l);
+ return c[0];
+}
+
+#define mbtolower(t) tolower(mb2wchar_with_len(t))
+#define mbtoupper(t) toupper(mb2wchar_with_len(t))
+

/********************************************************************
*
@@ -293,7 +309,7 @@
workspace = texttowcs(string);

for (i = 0; workspace[i] != 0; i++)
- workspace[i] = towlower(workspace[i]);
+ workspace[i] = mbtolower(workspace[i]);

result = wcstotext(workspace, i);

@@ -359,7 +375,7 @@
workspace = texttowcs(string);

for (i = 0; workspace[i] != 0; i++)
- workspace[i] = towupper(workspace[i]);
+ workspace[i] = mbtoupper(workspace[i]);

result = wcstotext(workspace, i);

------=_Part_37336_13553736.1133034476289--
