Re: BUG #1859: 3-octet private use UTF8 chars reported as identical

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Nathan Culwell-Kanarek" <nculwell(at)wisc(dot)edu>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #1859: 3-octet private use UTF8 chars reported as identical
Date: 2005-09-02 14:53:40
Message-ID: 14196.1125672820@sss.pgh.pa.us
Lists: pgsql-bugs

"Nathan Culwell-Kanarek" <nculwell(at)wisc(dot)edu> writes:
> Description: 3-octet private use UTF8 chars reported as identical

> We've run into a problem, which is that
> the PostgreSQL backend is interpreting 4 of the private use characters as
> being equivalent.

Your beef is actually with strcoll(); we just believe whatever that
function tells us when comparing strings. Check to see that you've
initdb'd in a utf8-based locale --- if not, that might be the source
of the problem. (IMHO, strcoll ought not claim distinct byte sequences
are equal in any case, but it seems some locale definitions will do
that.) If no luck, take it up with Red Hat's glibc folk.
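
To check the strcoll() behavior outside of PostgreSQL, something like the following sketch may help. It is not taken from the backend source; the helper names collation_hides_difference and check_in_locale are made up for illustration, and the locale name passed in is whatever your system actually has installed.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/*
 * Returns 1 when strcoll() reports the two strings as equal even though
 * their bytes differ, 0 otherwise. The result depends on the LC_COLLATE
 * setting in effect at the time of the call.
 */
int collation_hides_difference(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcoll(a, b) == 0 && strcmp(a, b) != 0;
}

/*
 * Hypothetical convenience wrapper: switch LC_COLLATE to locale_name and
 * run the check. Returns -1 if setlocale() rejects the locale name.
 */
int check_in_locale(const char *locale_name, const char *a, const char *b)
{
    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return -1;
    return collation_hides_difference(a, b);
}
```

For example, U+E000 and U+E001 are two 3-octet private use characters, encoded in UTF-8 as "\xEE\x80\x80" and "\xEE\x80\x81". Calling check_in_locale() on that pair with the locale the cluster was initdb'd in (for instance "en_US.UTF-8", assuming it is installed) shows whether the locale definition itself treats them as equal. In the "C" locale strcoll() behaves like strcmp(), so the check should come back 0 there.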

regards, tom lane
