Re: multibyte-character aware support for function "downcase_truncate_identifier()"

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Rajanikant Chirmade <rajanikant(dot)chirmade(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: multibyte-character aware support for function "downcase_truncate_identifier()"
Date: 2010-11-21 21:41:35
Message-ID: 26799.1290375695@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Wed, Jul 7, 2010 at 10:07 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> IIRC this is intentional. Please consult the archives for previous
>> discussions.

> Why would this be intentional?

Well, it's intentional for lack of any infrastructure that would allow
a more spec-compliant approach. As you say, calling str_tolower here
is probably a non-starter for performance reasons. Another big problem
is that str_tolower produces a locale-specific downcasing conversion.
This (a) is going to create portability headaches of the first magnitude,
and (b) is not really an advance in terms of spec compliance. The SQL
spec says that identifier case folding should be done according to the
Unicode standard, but it's not safe to assume that any random
platform-specific locale is going to act that way. A specific example
of a locale that is known to NOT behave acceptably is Turkish: it
distinguishes dotted and dotless i, so I does not downcase to i there,
which in fact broke things back when we used to use tolower for this
purpose. See the archives from early 2004,
and in particular commit 59f9a0b9df0d224bb62ff8ec5b65e0b187655742, which
removed the exact same logic (though not wide-character-aware) that this
patch proposes to put back.
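
For illustration only, here is a minimal sketch of what locale-independent
folding looks like when it is restricted to ASCII. The name ascii_downcase
and its signature are hypothetical, and this is not the code in
downcase_truncate_identifier(); it just shows a fold whose result cannot
vary with the platform locale, because it never calls tolower. It does not
attempt the Unicode case folding the spec asks for: every byte outside
A-Z, including each byte of a multibyte character, is copied unchanged.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not PostgreSQL source): copy src into dst, folding
 * only the ASCII letters 'A'..'Z' to lower case.  No locale-aware function
 * is called, so the result is the same under every locale setting.
 * At most dstsize - 1 bytes are copied and dst is always NUL-terminated
 * when dstsize > 0.  Truncation here is byte-wise and could split a
 * multibyte character; a real implementation would need to clip on a
 * character boundary.
 */
static void
ascii_downcase(const char *src, char *dst, size_t dstsize)
{
    size_t      i;

    if (dstsize == 0)
        return;

    for (i = 0; src[i] != '\0' && i + 1 < dstsize; i++)
    {
        unsigned char ch = (unsigned char) src[i];

        /* Pure arithmetic on the ASCII range; high-bit bytes pass through. */
        if (ch >= 'A' && ch <= 'Z')
            ch += 'a' - 'A';
        dst[i] = (char) ch;
    }
    dst[i] = '\0';
}
```

With this approach "TITLE" folds to "title" regardless of locale, whereas a
tolower-based fold under a Turkish locale may map I to dotless i instead.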

I think the given patch can be rejected out of hand. If the OP has any
ideas about doing non-locale-dependent case folding at an acceptable
speed, I'm happy to listen.

regards, tom lane
