| From: | brian <brian(at)zijn-digital(dot)com> |
|---|---|
| To: | pgsql-general(at)postgresql(dot)org |
| Subject: | match accented chars with ASCII-normalised version |
| Date: | 2008-01-25 05:02:55 |
| Message-ID: | 47996D7F.3070801@zijn-digital.com |
| Lists: | pgsql-general |
The client for a web application I'm working on wants certain URLs to
contain the full names of members ("SEO-friendly" links). Scripts would
then look up, say, a member directory entry by the member's name rather
than by row ID. I can easily join first & last names with an underscore
(and split on that later), replace spaces with +, etc. But many of the
names contain multibyte characters, so the URLs would end up
percent-encoded, e.g.:
Adelina España -> Adelina_Espa%C3%B1a
The client won't like this (and neither will I).
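
To make the encoding problem concrete, here is a minimal Python sketch of what happens when the joined name goes straight into a URL. The helper name `url_segment` is made up for illustration; only `urllib.parse.quote` and `unquote` come from the standard library.

```python
from urllib.parse import quote, unquote


def url_segment(first: str, last: str) -> str:
    """Join the names with an underscore and percent-encode the result.

    quote() encodes non-ASCII characters as their UTF-8 bytes, which is
    why a single 'ñ' turns into the two escapes %C3%B1.
    """
    return quote(f"{first}_{last}")


encoded = url_segment("Adelina", "España")
print(encoded)
# Decoding reverses the process and restores the original name.
print(unquote(encoded))
```

The encoded form round-trips cleanly, so nothing is lost; the only objection is how it looks in the address bar.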
I can create a conversion array that maps accented characters to their
plain ASCII equivalents:
Adelina_Espana
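
One possible way to do that conversion without maintaining the array by hand is Unicode decomposition: split each accented character into a base letter plus combining marks, then drop the marks. This is a sketch in Python, not the conversion array itself; the function names `strip_accents` and `link_name` are hypothetical.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after NFKD decomposition.

    'ñ' decomposes into 'n' + COMBINING TILDE; dropping the combining
    character leaves plain 'n'.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def link_name(first: str, last: str, lowercase: bool = False) -> str:
    """Build the URL form of a name: underscore join, spaces become '+'."""
    name = strip_accents(f"{first}_{last}").replace(" ", "+")
    return name.lower() if lowercase else name


print(link_name("Adelina", "España"))
print(link_name("Adelina", "España", lowercase=True))
```

Characters with no Unicode decomposition (for example 'ø' or 'ß') pass through unchanged, so a small explicit mapping would still be needed for those.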
However, I then run into the problem of matching the 'Espana' from the
URL against the 'España' stored in the table. Searching online, I found
a few ideas (soundex, intuitive fuzzy something-or-other) but mostly
they seem like overkill for this application.
The best I can come up with is to add a 'link_name' column to the table
that holds the 'normalised' version of the name ('Adelina_Espana', or
even 'adelina_espana'). The duplication bugs me a little but the table
currently stands at a whopping ~3500 names, so I'm not too concerned.
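
A rough sketch of the extra-column idea follows, using Python's built-in sqlite3 purely as a stand-in so the example runs on its own; it is not PostgreSQL. The table name `member`, its columns other than `link_name`, and the second sample name are assumptions made up for illustration.

```python
import sqlite3
import unicodedata


def make_link_name(first: str, last: str) -> str:
    """Lowercased, accent-stripped 'first_last' form of a name."""
    decomposed = unicodedata.normalize("NFKD", f"{first}_{last}")
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower().replace(" ", "+")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE member ("
    " id INTEGER PRIMARY KEY,"
    " first_name TEXT, last_name TEXT,"
    " link_name TEXT UNIQUE)"  # UNIQUE also gives the lookup an index
)

# The normalised value is computed once, at write time.
for first, last in [("Adelina", "España"), ("José", "Núñez")]:
    conn.execute(
        "INSERT INTO member (first_name, last_name, link_name) VALUES (?, ?, ?)",
        (first, last, make_link_name(first, last)),
    )


def find_member(conn: sqlite3.Connection, slug: str):
    """Look up a member by the name taken from the URL."""
    return conn.execute(
        "SELECT first_name, last_name FROM member WHERE link_name = ?",
        (slug.lower(),),
    ).fetchone()


print(find_member(conn, "Adelina_Espana"))
```

The UNIQUE constraint is a design choice worth thinking about: two members whose names differ only by accents would collide on the same link_name, so some tie-breaker would be needed in that case.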
My question is: well, does this look like the way to go, considering
it's just a web app (and isn't likely to ever top 10000 names)? Or is
there something clever (yet not overkill) that I'm missing?
If I do go this route, I'd add an insert/update trigger to call a
function (PL/Perl, I'm looking at you) that handles the conversion to
link_name.
brian