Culturally aware initcap

From: Peter Geoghegan <peter(dot)geoghegan86(at)gmail(dot)com>
To: PGSQL Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Culturally aware initcap
Date: 2010-04-20 09:48:36
Message-ID: w2udb471ace1004200248o11f480dn7142e2016822d57@mail.gmail.com
Lists: pgsql-general

Hello,

I've devised the following function, which performs the same task as
initcap but in a "culturally aware" fashion for English. It solves a
common problem I was having with initcap, where the string "ROSEMARY'S
baby DOESN'T LIVE HERE anymore" became "Rosemary'S Baby Doesn'T Live
Here Anymore", whereas I wanted to see "Rosemary's Baby Doesn't Live
Here Anymore". It still preserves Irish names like O'Shaughnessy and
O'Sullivan. This may be more useful than the generic initcap() for
exclusively English-language databases.

CREATE OR REPLACE FUNCTION cul_initcap(input_val text) RETURNS text AS
$function_body$

SELECT replace(
         replace(
           replace(
             regexp_replace(initcap($1),
               $$'([MST])([^[:upper:][:lower:]]|$)$$,
               $$'{@*#!\1!#*@}\2$$,
               'g'
             )
           , '{@*#!M!#*@}', 'm')
         , '{@*#!S!#*@}', 's')
       , '{@*#!T!#*@}', 't');

$function_body$
LANGUAGE 'sql' IMMUTABLE;
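
As a quick check of the behaviour described above, the function can be
called on the example string and on one of the Irish names. This is only
a usage sketch and assumes cul_initcap has been created exactly as
defined above.

```sql
-- Contractions and possessives: the letter after the apostrophe is lowered.
SELECT cul_initcap('ROSEMARY''S baby DOESN''T LIVE HERE anymore');
-- Rosemary's Baby Doesn't Live Here Anymore

-- Irish names: the 'S is followed by a letter, so the regex leaves it alone.
SELECT cul_initcap('o''sullivan');
-- O'Sullivan
```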

Now, this works, but is a little inelegant. I couldn't figure out a
better way of making regexp_replace's replacement lower case than
wrapping part of its output in magical braces of {@*#! and !#*@}
and subsequently replacing those magical braces and their contents
with the appropriate lower-case strings using multiple replace() calls.
One obvious problem with this function is that it will not correctly
initcap a "magical brace enclosed literal" such as '{@*#!T!#*@}',
although I dare say that isn't enough of a problem to discourage its
use.

Can someone suggest a better implementation that doesn't rely on
magical braces? Either way, I'm going to post this on the PostgreSQL
wiki under "snippets", because I think it's of general interest, and
the wiki currently lacks a template solution, which it probably should
have.
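
One possible brace-free variant, offered only as a sketch: since the
set of letters is fixed (M, S and T), each can be handled by its own
regexp_replace call whose replacement spells out the lower-case letter
literally, so no placeholder is needed. The name cul_initcap_nobrace is
hypothetical, and this has not been checked against every edge case. It
may differ from the function above when two matches sit back to back
(for example 'S'T).

```sql
-- Hypothetical alternative: one regexp_replace per letter, no magic braces.
-- Uses the same "not followed by a letter" test as the original pattern.
CREATE OR REPLACE FUNCTION cul_initcap_nobrace(input_val text) RETURNS text AS
$function_body$

SELECT regexp_replace(
         regexp_replace(
           regexp_replace(initcap($1),
             $$'M([^[:upper:][:lower:]]|$)$$, $$'m\1$$, 'g'),
           $$'S([^[:upper:][:lower:]]|$)$$, $$'s\1$$, 'g'),
         $$'T([^[:upper:][:lower:]]|$)$$, $$'t\1$$, 'g');

$function_body$
LANGUAGE sql IMMUTABLE;
```

The trade-off is three passes over the string instead of one regex pass
plus three plain replace() calls, in exchange for not reserving any
marker text in the input.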

Regards,
Peter Geoghegan
