Re: dmetaphone woes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: dmetaphone woes
Date: 2010-04-05 02:04:21
Message-ID: 422.1270433061@sss.pgh.pa.us
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> While testing pgindent the other day, I found some infelicities in
> contrib/fuzzystrmatch/dmetaphone.c. From pgindent's point of view, the
> problem is that the code contains two characters in case labels with the
> high bits set, and this blows pgindent up on my Linux box if the locale
> happens to be en_US.utf8 instead of C.

Not only pgindent ...
http://archives.postgresql.org/pgsql-hackers/2008-10/msg00308.php

> However, that doesn't solve the fundamental problem, which is that the
> code in question is pretty much broken for any encoding but Latin1.

Yeah. I don't see an easy fix for it either, but there should be a
TODO entry about it. In the meantime I'm surprised we didn't insert
octal escapes already.
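
For illustration only, here is a minimal sketch of what octal escapes in
case labels could look like. It is not the actual dmetaphone.c code: the
function name latin1_base_letter and its return values are hypothetical,
and I am assuming the two characters in question are C-with-cedilla
(0xC7) and N-with-tilde (0xD1) in Latin1. The source file stays pure
ASCII, so tools like pgindent no longer depend on the locale.

```c
/*
 * Hypothetical helper, not taken from dmetaphone.c.
 * Maps two Latin1 bytes to an ASCII base letter, or returns 0.
 * The case labels use octal escapes, so this file contains no
 * bytes with the high bit set.
 */
static char
latin1_base_letter(char c)
{
    switch (c)
    {
        case '\307':            /* 0xC7: C with cedilla in Latin1 */
            return 'C';
        case '\321':            /* 0xD1: N with tilde in Latin1 */
            return 'N';
        default:
            return 0;           /* anything else is not handled here */
    }
}
```

Note that this only tidies up the source text. A byte-at-a-time switch
still assumes Latin1 input: in UTF-8, C-with-cedilla is the two bytes
0xC3 0x87, and neither of them matches the labels above.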

regards, tom lane
