
Re: Locale-dependent case conversion in {identifier}

From: Nicolai Tufar <ntufar(at)apb(dot)com(dot)tr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Locale-dependent case conversion in {identifier}
Date: 2002-11-30 07:57:44
Message-ID: 3DE86F78.9000905@apb.com.tr
Lists: pgsql-advocacy, pgsql-general, pgsql-hackers

By no means would I try to convince you that your reading of
the SQL standard is wrong. What I am trying to say is
that the Turkish alphabet is broken beyond repair. And since
there is absolutely no way to change our alphabet, maybe we
can code a workaround in the code.

So I do not claim that your code is wrong. It is
behaving according to the specification. But unfortunately
the folks behind SQL99 were probably not aware of the woes
of the Turkish "I".

The very special case of the letter "I" in Turkish is not
only PostgreSQL's problem. Many Java programs have
failed miserably trying to open files with "I"s in
their pathnames.

So basically, there are two letters "I" in Turkish.
One has a dot on top and the other does not.
The one with the dot always has the dot, and the one
without never has it. Simple. The problem is
with the standard Latin "I": why does small "i"
have a dot while capital "I" does not?

The standard Turkish conversion is:
Lower: "I" -> "ı" and "İ" -> "i".
Upper: "ı" -> "I" and "i" -> "İ".
(font may not be displayed correctly in your mail reader)
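
To make the mapping concrete, here is a minimal sketch in C that works on Unicode code points, so it does not depend on the mail reader's font or on any locale setting. The function names tr_tolower and tr_toupper are made up for illustration; they are not part of PostgreSQL or of the C library, and only the "I" family and plain ASCII letters are handled.

```c
#include <stdint.h>

/* Unicode code points of the four letters involved. */
#define CP_CAPITAL_I         0x0049u  /* I  (no dot)   */
#define CP_SMALL_I           0x0069u  /* i  (with dot) */
#define CP_CAPITAL_DOTTED_I  0x0130u  /* capital I with dot above */
#define CP_SMALL_DOTLESS_I   0x0131u  /* small dotless i          */

/* Hypothetical helper: lower-case one code point by Turkish rules. */
uint32_t tr_tolower(uint32_t cp)
{
    if (cp == CP_CAPITAL_I)
        return CP_SMALL_DOTLESS_I;      /* dotless stays dotless */
    if (cp == CP_CAPITAL_DOTTED_I)
        return CP_SMALL_I;              /* dotted stays dotted   */
    if (cp >= 0x41u && cp <= 0x5Au)     /* remaining A-Z         */
        return cp + 0x20u;
    return cp;
}

/* Hypothetical helper: upper-case one code point by Turkish rules. */
uint32_t tr_toupper(uint32_t cp)
{
    if (cp == CP_SMALL_DOTLESS_I)
        return CP_CAPITAL_I;
    if (cp == CP_SMALL_I)
        return CP_CAPITAL_DOTTED_I;
    if (cp >= 0x61u && cp <= 0x7Au)     /* remaining a-z         */
        return cp - 0x20u;
    return cp;
}
```

Note that under these rules lower-casing ASCII "I" leaves the ASCII range, which is exactly what bites identifiers and keywords.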

Historically, programs that operate in a Turkish locale have
chosen to hardcode the capitalisation of "i" in system
messages and identifier names like this:

Lower: "I" -> "i" and "İ" -> "i".
Upper: "ı" -> "I" and "i" -> "I".

With this, no matter what kind of "I" you used in names,
it is always going to end up a valid ASCII character.
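
As a rough sketch of this hardcoded folding, again on Unicode code points and in C: both kinds of "I" collapse onto the ASCII pair "I"/"i". The names safe_tolower and safe_toupper are hypothetical and chosen only for this example.

```c
#include <stdint.h>

/* Hypothetical helper: lower-case so that every "I" becomes ASCII 'i'. */
uint32_t safe_tolower(uint32_t cp)
{
    if (cp == 0x0130u)                  /* capital I with dot -> 'i' */
        return 0x0069u;
    if (cp >= 0x41u && cp <= 0x5Au)     /* A-Z, including 'I' -> 'i' */
        return cp + 0x20u;
    return cp;
}

/* Hypothetical helper: upper-case so that every "i" becomes ASCII 'I'. */
uint32_t safe_toupper(uint32_t cp)
{
    if (cp == 0x0131u)                  /* small dotless i -> 'I'    */
        return 0x0049u;
    if (cp >= 0x61u && cp <= 0x7Au)     /* a-z, including 'i' -> 'I' */
        return cp - 0x20u;
    return cp;
}
```

The price is that the two Turkish letters are no longer distinguished after folding, which is tolerable for identifiers and system messages but not for user data.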

Would it be acceptable if I submitted a patch that applies this
special logic in src/backend/parser/scan.l if the locale is "tr_TR"?

I ask because for many folks setting the locale to Turkish would
render their database unusable: god forbid your
SQL has a column name written in capitals that includes an "I",
because then it does not work. So I deeply believe that the PostgreSQL community
has to provide a workaround for this problem.
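
To give an idea of the shape such a workaround could take, here is a small sketch in C. It is not the actual scan.l code and not the patch itself; is_turkish_locale and ascii_downcase_ident are names I made up, and matching on the "tr_TR" prefix is an assumption about how the locale would be detected.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical check: does the locale name start with "tr_TR"? */
int is_turkish_locale(const char *locale_name)
{
    return locale_name != NULL && strncmp(locale_name, "tr_TR", 5) == 0;
}

/* Downcase an identifier in place, touching only the bytes A-Z.
 * Every other byte is left alone, so the result never depends
 * on the locale's idea of what lower-case "I" is. */
void ascii_downcase_ident(char *ident)
{
    for (char *p = ident; *p != '\0'; p++)
    {
        if (*p >= 'A' && *p <= 'Z')
            *p = (char) (*p - 'A' + 'a');
    }
}
```

In a real program the locale name could come from setlocale(LC_CTYPE, NULL), and the ASCII-only path would be taken only when is_turkish_locale() returns nonzero, leaving the existing behaviour untouched for everyone else.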

So what should I do?

Best regards,
Nick




Tom Lane wrote:
> "Nicolai Tufar" <ntufar(at)apb(dot)com(dot)tr> writes:
> 
>>So I have changed lower-case conversion code in scan.l to make it purely
>>ASCII-based.
>>as in keywords.c. Mini-patch is given below.
> 
> 
> Rather than offering a patch, you need to convince us why our reading of
> the SQL standard is wrong.  ("Oracle does it that way" is not an
> argument that will carry a lot of weight.)
> 
> SQL99 states that identifier case conversions are done on the basis of
> the Unicode upper/lower case equivalences, so it seems clear that they
> intend more than ASCII-only conversion for identifiers.  Locale-based
> conversion might not be an exact implementation of the spec, but it's
> surely closer than ASCII-only.
> 
> 			regards, tom lane



