Re: Bug #659: lower()/upper() bug on ->multibyte<- DB

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: michael(dot)enke(at)wincor-nixdorf(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Bug #659: lower()/upper() bug on ->multibyte<- DB
Date: 2002-05-09 01:06:13
Message-ID: 20020509100613P.t-ishii@sra.co.jp
Lists: pgsql-bugs, pgsql-hackers

> > You input "select lower('X')" as ISO-8859-1 encoded, then it is sent
> > to the backend. The backend converts it to UTF-8. Then lower() is
> > called with a UTF-8 string as input. lower() calls tolower(), which
> > expects the input to be ISO-8859-1 since you set the locale to de_DE.
> > This is the source of the problem.
> 
> Excuse me, this seems not to be the source of the problem.
> If I call select lower(table_col) from table;
> then I also don't get back the lower case character but the original case if it is a multibyte char.

This fails for the same reason as above. The backend fetches
table_col from the table encoded in UTF-8, while lower()
expects ISO-8859-1. Try:

select convert(lower(convert(table_col, 'LATIN1')),'LATIN1','UNICODE')
from your_table;
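
The mismatch and the workaround can be modelled outside the database.
The following is a minimal Python sketch, not PostgreSQL code:
latin1_lower_bytes and is_valid_utf8 are hypothetical helpers. The first
stands in for lower() calling a single-byte tolower() under an
ISO-8859-1 locale, and the encode/decode calls stand in for convert().

```python
def latin1_lower_bytes(data: bytes) -> bytes:
    """Lowercase each byte as if it were one ISO-8859-1 character.

    Hypothetical stand-in for a per-byte tolower() under a de_DE
    (ISO-8859-1) locale; this is not the PostgreSQL implementation.
    """
    return data.decode("latin-1").lower().encode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# U+00C4 (A with diaeresis) as stored in a UNICODE database: b'\xc3\x84'
stored = "\u00c4".encode("utf-8")

# Mismatch: the UTF-8 bytes are lowercased as if they were ISO-8859-1.
broken = latin1_lower_bytes(stored)

# Workaround: convert to LATIN1, lowercase, convert back to UTF-8.
as_latin1 = stored.decode("utf-8").encode("latin-1")
lowered = latin1_lower_bytes(as_latin1)
fixed = lowered.decode("latin-1").encode("utf-8")

print(broken, is_valid_utf8(broken))
print(fixed, is_valid_utf8(fixed))
```

In this model the direct call rewrites the UTF-8 lead byte and yields
bytes that no longer form the intended character, while the round trip
through LATIN1 returns the UTF-8 encoding of the lower-case letter.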

> I did now also remove all below data directory, exported LC_CTYPE to de_DE.utf8, made an initdb.
> With pg_controldata I see LC_CTYPE is de_DE.utf8
> Now I no longer get the ERROR: cannot convert UTF-8 to ISO8859-1, but the translation doesn't work:
> MB chars are not translated, I get back the original case.

I don't think using de_DE.utf8 helps. The locale support just calls
tolower(), which is not able to handle multibyte characters.
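
To illustrate why a per-byte routine cannot lowercase a multibyte
character, here is a small Python sketch. It assumes that a per-byte
tolower() in a UTF-8 locale leaves bytes outside the ASCII range
unchanged; that is an assumption about the C library, modelled here with
Python's bytes.lower(), which maps only ASCII letters.

```python
# "\u00c4BC" is A-with-diaeresis followed by the ASCII letters B and C.
text = "\u00c4BC"
encoded = text.encode("utf-8")    # the first character takes two bytes

# Per-byte lowercasing (model of tolower() applied to each byte):
# only the ASCII bytes change, the two-byte character keeps its case.
per_byte = encoded.lower()

# Per-character lowercasing, which is what the user expects:
per_char = text.lower().encode("utf-8")

print(len(text), len(encoded))    # characters vs. bytes
print(per_byte)
print(per_char)
print(per_byte == per_char)
```

The two-byte sequence survives untouched in the per-byte result, which
matches the reported symptom of getting the original case back for
multibyte characters.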

> > Oops. That should be:
> > 
> > select convert(lower(convert('X', 'LATIN1')),'LATIN1','UNICODE');
> > It looks ugly, but works.
> 
> Sorry, it doesn't work. The same here, I get back the case I put in at X, not the lower case.

Are you sure you are using the de_DE locale (not de_DE.utf8)?
Included are sample scripts that work for me with the de_DE locale.
Here is also my pg_controldata output.

$ pg_controldata
pg_control version number:            71
Catalog version number:               200201121
Database state:                       IN_PRODUCTION
pg_control last modified:             Thu May  9 08:37:20 2002
Current log file id:                  0
Next log file segment:                1
Latest checkpoint location:           0/18C860
Prior checkpoint location:            0/1503A0
Latest checkpoint's REDO location:    0/172054
Latest checkpoint's UNDO location:    0/0
Latest checkpoint's StartUpID:        8
Latest checkpoint's NextXID:          217
Latest checkpoint's NextOID:          24748
Time of latest checkpoint:            Thu May  9 08:37:17 2002
Database block size:                  8192
Blocks per segment of large relation: 131072
LC_COLLATE:                           de_DE
LC_CTYPE:                             de_DE
--
Tatsuo Ishii
