Re: Case-Insensitve Text Comparison

From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Case-Insensitve Text Comparison
Date: 2008-06-02 05:13:07
Message-ID: BC402551-7E1B-4561-8F64-E098C6D58241@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Jun 1, 2008, at 21:08, Tom Lane wrote:

> "David E. Wheeler" <david(at)kineticode(dot)com> writes:
>> I really need case-insensitive string comparison in my database.
>
> Okay ... according to whose locale?

I'm using C. Of course you're correct that it depends on the locale, I
always forget that. But does not the Unicode standard offer up some
sort locale-independent case-insensitivity, so that it gets it right
some large percentage of the time?

>> Ideally there'd be a nice ITEXT data type (and friends, ichar,
>> ivarchar, etc.). But of course there isn't, and for years I've just
>> used LOWER() on indexes and queries to get the same result.
>
>> Only it turns out that I'm of course not getting the same result.
>
> I think that means you're not using the right locale.

What locale is right? If I have a Web app, there could be data in many
different languages in a single table/column.

>> 1. Does the use of the tolower() C function in the citext data type
>> on
>> pgfoundry basically give me the same results as using lower() in my
>> SQL has for all these years?
>
> [ broken record... ] Kinda depends on your locale. However,
> tolower()
> is 100% guaranteed not to work for multibyte encodings, so citext is
> quite useless if you're using UTF8. This is fixable, no doubt, but
> it's not fixed in the project as it stands.

Right, okay; thanks. I'm thinking about using it for email addresses
and domain names, however, so it might be adequate for those
applications.

>> 2. Isn't the ICU library distributed with PostgreSQL?
>
> Nope, it is not, and we have already pretty much determined that we
> do not want to make Postgres depend on ICU. See the archives.

Damn. Okay, thanks.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2008-06-02 05:18:55 Re: Case-Insensitve Text Comparison
Previous Message Joshua D. Drake 2008-06-02 04:46:20 Re: Overhauling GUCS