BUG: ILIKE with single-byte encoding

From: Rolf Jentsch <RJentsch(at)electronicpartner(dot)de>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG: ILIKE with single-byte encoding
Date: 2008-02-28 17:22:48
Message-ID: 200802281822.48864.RJentsch@electronicpartner.de
Lists: pgsql-bugs

Hello,

PostgreSQL 8.3.0 introduced the following bug in the ILIKE (~~*) operator:

In a database with a single-byte encoding such as LATIN1, the expression

SELECT 'aü' ILIKE '%ü';
returns false.

This error occurs for every pattern in which a % is followed by a character
with a decimal value between 128 and 255.

I was able to track down the error to the file
src/backend/utils/adt/like_match.c

For the single-byte case there are some places where a (signed) char value is
compared to the return value of tolower(), which is an int. The 'ü' in Latin1
is -4 as a signed char but 252 as the int returned by tolower(), so the two
values never compare equal.
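
The mismatch can be reproduced outside the backend with a minimal sketch. This
is not the code from like_match.c: the helper names match_unpatched,
match_patched and demo_self_check are hypothetical, signed char is used
explicitly so the result does not depend on the platform's default char
signedness, and the default "C" locale is assumed, in which tolower() leaves
byte 252 unchanged.

```c
#include <assert.h>
#include <ctype.h>

/*
 * Hypothetical helper mimicking the unpatched comparison: a (signed) char
 * from the text is compared directly with the int returned by tolower().
 */
int match_unpatched(signed char text, signed char pat)
{
    /* tolower() returns a value in 0..255 for an unsigned char argument */
    return text == tolower((unsigned char) pat);
}

/*
 * Hypothetical helper mimicking the patched comparison: the result of
 * tolower() is cast back to the char type before comparing.
 */
int match_patched(signed char text, signed char pat)
{
    return text == (signed char) tolower((unsigned char) pat);
}

/* Small self-check for the Latin1 'ü' (byte 0xFC) and a plain ASCII letter. */
void demo_self_check(void)
{
    signed char u_umlaut = (signed char) 0xFC;   /* -4 as signed char */

    assert(!match_unpatched(u_umlaut, u_umlaut)); /* -4 vs 252: no match */
    assert(match_patched(u_umlaut, u_umlaut));    /* -4 vs -4: match */
    assert(match_unpatched('a', 'A'));            /* ASCII is unaffected */
    assert(match_patched('a', 'A'));
}
```

Calling demo_self_check() from a main() should pass without tripping an
assertion, which matches the observation that only bytes from 128 to 255 are
affected.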

It can be fixed with the appended patch.

cu
Rolf Jentsch
Entwicklung Mitglieder-Systeme Dezentral

ElectronicPartner GmbH
Mündelheimer Weg 40
40472 Düsseldorf
phone: +49-(0)211-4156-0
fax: +49-(0)211-4156-6865
eMail: rjentsch(at)electronicpartner(dot)de

Sitz der Gesellschaft Düsseldorf
Amtsgericht - Registergericht Düsseldorf - HRB 4078
Geschäftsführer: Oliver Haubrich,
Dr. Sven-Olaf Krauß, Karl Trautman

--- src/backend/utils/adt/like_match.c 2008-02-28 18:19:30.000000000 +0100
+++ src/backend/utils/adt/like_match.c 2008-02-28 18:19:43.000000000 +0100
@@ -71,7 +71,7 @@
*/

#ifdef MATCH_LOWER
-#define TCHAR(t) tolower((t))
+#define TCHAR(t) ((char)tolower((t)))
#else
#define TCHAR(t) (t)
#endif
