Re: Regular expression to UPPER() a lower case string

From: "Peter J(dot) Holzer" <hjp-pgsql(at)hjp(dot)at>
To: Eagna <eagna(at)protonmail(dot)com>
Cc: Gianni Ceccarelli <dakkar(at)thenautilus(dot)net>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Regular expression to UPPER() a lower case string
Date: 2022-12-10 14:48:58
Message-ID: 20221210144858.vmaglmltkytgemu6@hjp.at
Lists: pgsql-general

On 2022-12-10 14:36:04 +0000, Eagna wrote:
> > I want to index on a REGEXP_REPLACE() - I thought using lower -> upper would be a good test.
>
> > I could always have used another REGEXP_REPLACE() for my testing,
> > but I then became "obsessed" with the idea of using
> > REGEXP_REPLACE() as a substitute for UPPER() - kind of an obfuscated
> > code competition with myself! :-)
>
> ========================
>
> So, I have no actual *_need_* for this, other than a desire to learn
> and understand what's going on and why.

You can't do that. Well, theoretically you could replace every
individual lower case letter with its upper case equivalent:

select regexp_replace(...regexp_replace(regexp_replace(s, 'a', 'A', 'g'), 'b', 'B', 'g')... 'z', 'Z', 'g') ...

but that would be insane even for the 26 letters of the basic Latin
alphabet, let alone the myriad accented letters (and other alphabets
like Cyrillic or Greek ...).
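
As a minimal sketch of the nested approach, here is the chain written out
for just three letters. The input string is made up for illustration. Note
that regexp_replace() only replaces the first match unless the 'g' flag is
given, so each call needs it:

```sql
-- Illustration only: handles a, b and c; every other letter is left alone.
SELECT regexp_replace(
         regexp_replace(
           regexp_replace('abc cab', 'a', 'A', 'g'),
           'b', 'B', 'g'),
         'c', 'C', 'g') AS uppercased;
```

Extending this to the whole alphabet means one nesting level per letter,
which is exactly why it gets out of hand so quickly.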

On second thought you could probably use NFD normalization to separate
base letters from accents, uppercase the base letters and then
(optionally) NFC normalize everything again. Still insane ;-).
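
A rough sketch of that idea for a single accented letter, assuming
PostgreSQL 13 or later (for normalize()) and a UTF8 server encoding. The
test character U+00E9 (e with acute accent) is chosen only as an example:

```sql
-- NFD splits U+00E9 into 'e' plus the combining acute accent U+0301,
-- the regexp uppercases the base letter, and NFC recombines the pair.
SELECT normalize(
         regexp_replace(normalize(U&'\00E9', NFD), 'e', 'E', 'g'),
         NFC) = U&'\00C9' AS same_as_upper;
```

This still needs one replacement per base letter, and it does nothing for
letters that have no decomposition into a base letter plus accents.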

hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp(at)hjp(dot)at  |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
