BUG #2625: Case insensitive regexp matching doesn't work on national characters

From: "Zoltan MEZEI" <mezei(dot)zoltan(at)telefor(dot)hu>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #2625: Case insensitive regexp matching doesn't work on national characters
Date: 2006-09-13 13:25:30
Message-ID: 200609131325.k8DDPUYW059944@wwwmaster.postgresql.org
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 2625
Logged by: Zoltan MEZEI
Email address: mezei(dot)zoltan(at)telefor(dot)hu
PostgreSQL version: 8.0.3
Operating system: Centos Linux 3.7
Description: Case insensitive regexp matching doesn't work on
national characters
Details:

(The bug is also present in 8.1.4; the libc version used is 2.3.2.)

Symptom:
select '' ~* '';
false
select upper('') ~* upper('');
true

Information:
LC_CTYPE and LC_COLLATE are set to hu_HU.utf8. The database encoding is
UNICODE.

Proposed solution:
The problem is that the regex module doesn't use the functions from
wctype.h, and because of that it cannot handle upper() of multibyte
characters properly. If it used the wctype functions, the problem would be
solved.
:-)
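
A minimal sketch of the idea, not PostgreSQL's actual regex code: the helper
name wcs_equal_nocase is hypothetical, and it assumes the input has already
been converted to wchar_t (for example with mbstowcs()) and that the caller
has selected the intended locale with setlocale(LC_CTYPE, ...). It folds each
wide character with towlower() from wctype.h before comparing, which is
roughly what a case-insensitive match would need to do per character.

```c
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not PostgreSQL code): compare two wide-character
 * strings case-insensitively by folding each character with towlower().
 * How non-ASCII characters are folded depends on the active LC_CTYPE
 * locale, so the caller is assumed to have called setlocale() beforehand.
 */
static bool
wcs_equal_nocase(const wchar_t *a, const wchar_t *b)
{
    while (*a != L'\0' && *b != L'\0')
    {
        /* Fold both sides to lower case before comparing. */
        if (towlower((wint_t) *a) != towlower((wint_t) *b))
            return false;
        a++;
        b++;
    }

    /* Equal only if both strings ended at the same position. */
    return *a == L'\0' && *b == L'\0';
}
```

With ASCII input this behaves the same in any locale, e.g.
wcs_equal_nocase(L"ABC", L"abc") is true. Whether accented letters fold
correctly depends on the locale in effect (hu_HU.utf8 in this report) and on
the libc's wctype tables.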
