Re: Invalid byte sequence for encoding "UTF8", caused due to non wide-char-aware downcase_truncate_identifier() function on WINDOWS

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Invalid byte sequence for encoding "UTF8", caused due to non wide-char-aware downcase_truncate_identifier() function on WINDOWS
Date: 2011-06-08 00:52:05
Message-ID: BANLkTim+OU9f++yR67No6sHfuQYkhR4Veg@mail.gmail.com
Lists: pgsql-hackers

2011/6/7 Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>:
> since we smash the identifier to lower case using the
> downcase_truncate_identifier() function, the solution is to make this
> function wide-char aware, like the LOWER() function.
>
> I see some discussion related to downcase_truncate_identifier() and
> a wide-char aware function, but it seems to have been lost somewhere.
> (http://archives.postgresql.org/pgsql-hackers/2010-11/msg01385.php)
> This invalid byte sequence issue seems more serious, because it
> might lead, e.g., to pg_dump failures.

It's a problem, but without an efficient algorithm for Unicode case
folding, any fix we attempt to implement seems like it'll just be
moving the problem around.
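
To make the failure mode concrete, here is a minimal sketch of how
byte-at-a-time downcasing can corrupt a UTF-8 identifier. It is not the
PostgreSQL source: latin1_style_lower() is a hypothetical stand-in for what
tolower() might return under a single-byte Windows locale, and the other
function names are assumptions for illustration only.

```python
def latin1_style_lower(byte: int) -> int:
    """Hypothetical stand-in for tolower() in a single-byte (Latin-1-like) locale."""
    if 0x41 <= byte <= 0x5A:                    # ASCII 'A'..'Z'
        return byte + 0x20
    if 0xC0 <= byte <= 0xDE and byte != 0xD7:   # Latin-1 uppercase letters
        return byte + 0x20
    return byte


def bytewise_downcase(ident: bytes) -> bytes:
    """Downcase every byte independently, ignoring multibyte boundaries."""
    return bytes(latin1_style_lower(b) for b in ident)


def ascii_only_downcase(ident: bytes) -> bytes:
    """Downcase only ASCII letters and leave high-bit bytes untouched."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in ident)


def is_valid_utf8(data: bytes) -> bool:
    """Report whether the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# "TÉST": the É is encoded in UTF-8 as the two bytes 0xC3 0x89.
ident = "T\u00c9ST".encode("utf-8")

broken = bytewise_downcase(ident)    # lead byte 0xC3 becomes 0xE3
safe = ascii_only_downcase(ident)    # multibyte sequence left alone

print(is_valid_utf8(broken))
print(is_valid_utf8(safe))
```

The byte-wise version turns the lead byte 0xC3 into 0xE3, which announces a
three-byte sequence that never arrives, so the result is no longer valid
UTF-8. The ASCII-only version stays valid but leaves É unfolded, which is one
sense in which a cheap fix only moves the problem around.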

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
