Re: create custom collation from case insensitive portuguese

From: Alexandre Riveira <alexandre(at)objectdata(dot)com(dot)br>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Euler Taveira de Oliveira <euler(at)timbira(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: create custom collation from case insensitive portuguese
Date: 2010-11-11 22:11:42
Message-ID: 4CDC6A1E.4050004@objectdata.com.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you all for your help !

When mentioned in Portuguese case-insensitive in fact we are also
talking about accent-insensitive

A common example is that the name Jose and José also can be written,
citext or ilike only not solve the problem

My progress is ...

Edit file /usr/share/i18n/locales/i18n e alter section tolower /, an
example:

(<U00C9>,<U00E9>) e alter for (<U00C9>,<U0065>)

LOWER reproduce: LOWER("ITAPAGÉ") => "itapage",

Example success:

SELECT * FROM endereco WHERE LOWER(logradouro) LIKE LOWER('itapage%')

this behavior is reproduced in citext (logradouro is column citext)

SELECT * FROM endereco WHERE logradouro = 'itapage'
or
SELECT * FROM endereco WHERE logradouro LIKE 'itapage%'

All examples return the desired value "ITAPAGÉ"
Issue:

SELECT * FROM endereco WHERE logradouro LIKE 'itapage%' NOT USE INDEX
I tried
CREATE INDEX cep_ik_logradouro ON cep USING btree (logradouro);
CREATE INDEX like_index ON cep(logradouro varchar_pattern_ops);
CREATE INDEX cep_ci_index ON cep(lower(logradouro) varchar_pattern_ops);

I've had success with using index
SELECT * FROM endereco WHERE LOWER(logradouro) LIKE LOWER('itapage%')
and CREATE INDEX cep_ci_index ON cep(lower(logradouro) varchar_pattern_ops)
But I want to solve using only citext

Tank's

Alexandre Riveira
Brazil

Robert Haas escreveu:
> On Tue, Nov 2, 2010 at 8:40 AM, Euler Taveira de Oliveira
> <euler(at)timbira(dot)com> wrote:
>
>> Alexandre Riveira escreveu:
>>
>>> I've achieved some success in changing collate operating system (linux)
>>> to generate sort of way of Brazil Portuguese hopes by adding the
>>> following code in LC_COLLATE
>>>
>>>
>> This was already discussed; search the archives [1] [2].
>>
>>
>>> So far, I understood the mechanism of change collate and reproduce in
>>> postgresql, and I could not generate a case-insensitive search, I
>>> believe that would change within the LC_COLLATE variable, but could not
>>> go any further than that.
>>>
>>>
>> PostgreSQL doesn't support case-insensitive searches specifying the collate
>> per column yet. Look at [3]. But you could use ILIKE or regular expression to
>> achieve that.
>>
>
> Is citext also useful for this?
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-11-11 22:21:34 Restructuring plancache.c API
Previous Message Andrew Dunstan 2010-11-11 21:12:02 Re: improved parallel make support