Re: Turkish locale bug

From: Sezai YILMAZ <sezaiy(at)ata(dot)cs(dot)hun(dot)edu(dot)tr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Turkish locale bug
Date: 2001-02-20 09:24:59
Message-ID: 3A9237EB.7B8818F9@ata.cs.hun.edu.tr
Lists: pgsql-bugs pgsql-hackers

Tom Lane wrote:
>
> Sezai YILMAZ <sezaiy(at)ata(dot)cs(dot)hun(dot)edu(dot)tr> writes:
> > With the Turkish locale it is not possible to write SQL queries in
> > CAPITAL letters. SQL identifiers like "INSERT" and "UNION" are first
> > folded to "ınsert" and "unıon". Then "ınsert" and "unıon"
> > do not match as SQL identifiers.
>
> Ugh.
>
> > for(i = 0; yytext[i]; i++)
> >     if (isascii((unsigned char)yytext[i]) &&
> >         isupper(yytext[i]))
> >         yytext[i] = tolower(yytext[i]);
>
> > I think it would be better to use something that does what
> > tolower() does, but only for English letters, so that the folding
> > stays in the English locale. I think this will solve the problem.
>
> > yytext[i] += 32;
>
> Hm. Several problems here:
>
> (1) This solution would break in other locales where isupper() may
> return TRUE for characters other than 'A'..'Z'.
>
> (2) We could fix that by gutting the isascii/isupper test as well,
> reducing it to "yytext[i] >= 'A' && yytext[i] <= 'Z'", but I'd prefer to
> still be able to say that "identifiers fold to lower case" works for
> whatever the local locale thinks is upper and lower case. It would be
> strange if identifier folding did not agree with the SQL lower()
> function.
>
> (3) I do not like the idea of hard-wiring knowledge of ASCII encoding
> here, even if it's unlikely that anyone would ever try to run Postgres
> on a non-ASCII-based system.
>
> I see your problem, but I'm not sure of a solution that doesn't have bad
> side-effects elsewhere. Ideas anyone?
>
> regards, tom lane

You are right. What about this one?

================================================================
{identifier}    {
    int          i;
    ScanKeyword *keyword;

    /* I think many platforms understand the
       following and set the locale to the 7-bit
       ASCII character set (English) */

    setlocale(LC_ALL, "C");

    for(i = 0; yytext[i]; i++)
        if (isascii((unsigned char)yytext[i]) &&
            isupper(yytext[i]))
            yytext[i] = tolower(yytext[i]);

    /* This sets the locale back to the default
       locale which the user prefers to use */

    setlocale(LC_ALL, "");
================================================================

This works on my Linux box, but I am not sure about other
platforms. What do you think about the performance?
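
One caveat with the fragment above: setlocale(LC_ALL, "") selects the
locale from the environment, which is not necessarily the locale that was
in effect before the call. A minimal sketch of the same idea that saves
and restores only LC_CTYPE might look like the following. The function
name fold_identifier_c_locale is hypothetical (it is not in the PostgreSQL
sources), and the fallback to a plain 'A'..'Z' test when the old locale
cannot be saved is my assumption, borrowed from point (2) above.

```c
#include <ctype.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: fold ASCII upper-case letters in text to lower
 * case while LC_CTYPE is temporarily switched to "C", then put the
 * previous LC_CTYPE setting back.
 */
void fold_identifier_c_locale(char *text)
{
    const char *current = setlocale(LC_CTYPE, NULL);  /* query only */
    char       *saved = NULL;
    size_t      i;

    /* Later setlocale() calls may overwrite the returned string,
       so keep a private copy of the current locale name. */
    if (current != NULL)
    {
        saved = malloc(strlen(current) + 1);
        if (saved != NULL)
            strcpy(saved, current);
    }

    if (saved == NULL)
    {
        /* Assumption: if the old locale cannot be saved, leave the
           locale alone and use the explicit ASCII range test. */
        for (i = 0; text[i]; i++)
            if (text[i] >= 'A' && text[i] <= 'Z')
                text[i] = (char) (text[i] - 'A' + 'a');
        return;
    }

    setlocale(LC_CTYPE, "C");

    for (i = 0; text[i]; i++)
    {
        unsigned char c = (unsigned char) text[i];

        /* Only touch 7-bit characters, as in the original loop. */
        if (c < 0x80 && isupper(c))
            text[i] = (char) tolower(c);
    }

    /* Restore whatever LC_CTYPE was in effect on entry. */
    setlocale(LC_CTYPE, saved);
    free(saved);
}
```

Called as fold_identifier_c_locale(yytext), this would replace the loop
and both setlocale() calls in the fragment. It still calls setlocale()
twice per identifier, so the performance question stays open.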

regards
-sezai
