Skip site navigation (1) Skip section navigation (2)

Re: Case-Insensitve Text Comparison

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Case-Insensitve Text Comparison
Date: 2008-06-02 05:21:47
Message-ID: Pine.LNX.4.64.0806020919181.21547@sn.sai.msu.ru (view raw or flat)
Thread:
Lists: pgsql-hackers
David,

we wrote contrib module (mchar) for one customer, which ports its
application from mssql to postgres. It does case-insensitive comparison 
for new data type 'mchar' and linked with ICU for system independent locale.

Oleg

On Sun, 1 Jun 2008, David E. Wheeler wrote:

> Howdy,
>
> I'm sure I'm just showing off my ignorance here, but here goes
>
> I really need case-insensitive string comparison in my database. Ideally 
> there'd be a nice ITEXT data type (and friends, ichar, ivarchar, etc.). But 
> of course there isn't, and for years I've just used LOWER() on indexes and 
> queries to get the same result.
>
> Only it turns out that I'm of course not getting the same result. This 
> script:
>
> #!/usr/local/bin/perl -w
>
> use strict;
> use warnings;
> use utf8;
> binmode STDOUT, ':utf8';
> use DBI;
>
> my $dbh = DBI->connect('dbi:Pg:dbname=try', 'postgres', '', { pg_enable_utf8 
> => 1 });
> for my $char qw(              A B C D ) {
>   print "$char: ", $dbh->selectrow_array('SELECT LOWER(?)', undef, $char ), 
> $/;
> }
>
> Yields this output:
>
> : 
> : 
> : 
> : 
> : 
> : 
> : 
> : 
> : 
> : 
> : 
> : 
> : 
> A: a
> B: b
> C: c
> D: d
>
> So it doesn't really work on anything other than ASCII, it looks like. So I 
> have two questions:
>
> 1. Does the use of the tolower() C function in the citext data type on 
> pgfoundry basically give me the same results as using lower() in my SQL has 
> for all these years? IOW, does it convert letters to lowercase in the same 
> way that the LOWER() SQL function does? If so, I think I might start to use 
> it for my case-insensitive columns and simplify my SQL a bit.
>
> http://pgfoundry.org/projects/citext/
>
> 2. Isn't the ICU library distributed with PostgreSQL? And if so, could it not 
> be used to create proper case conversions in LOWER() and friends and, 
> ultimately, to create a case-insensitive text type in core? I'm seeing that 
> it has a constant named U_COMPARE_IGNORE_CASE that can be used with its 
> unorm_compare() function:
>
> http://www.icu-project.org/apiref/icu4c/unorm_8h.html#6cc4c8b0d5be7ce1ac4b600ace7817f5
> http://www.icu-project.org/apiref/icu4c/unorm_8h.html#991e0fe6f0d062dd6e8e924517f3f437
>
> I don't really know C, but if that's stuff there, can't we take advantage of 
> it for proper case-insensitive comparisons (and conversions)?
>
> Thanks,
>
> David
>
>
>

 	Regards,
 		Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

In response to

Responses

pgsql-hackers by date

Next:From: Tom LaneDate: 2008-06-02 05:27:23
Subject: Re: Overhauling GUCS
Previous:From: Tom LaneDate: 2008-06-02 05:18:55
Subject: Re: Case-Insensitve Text Comparison

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group