Re: BUG #4999: select 'a' < 'A' is true, but should be false . . .

From: "Brian Ceccarelli" <ceccareb(at)talusmusic(dot)com>
To: "'Greg Stark'" <gsstark(at)mit(dot)edu>, "'Brian Ceccarelli'" <ceccareb(at)talussoftware(dot)com>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #4999: select 'a' < 'A' is true, but should be false . . .
Date: 2009-08-26 01:38:31
Message-ID: 000901ca25ed$ebf09590$c3d1c0b0$@com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Actually Greg . . .

After a long and careful study of this, this is a Postgres bug.
Postgres is boogering up the sort. The operators < >, and the "order by"
in the select statement are both case-sensitive and case-insensitive. It
is inconsistent. It does not matter what my collating sequence is. This
is the behavior I am seeing:

If the words are the same words but letters have different case, then the
operator is case-sensitive.
If the words are not the same words, then the operator is case-insensitive
until the operator reaches the character position in both strings where the
letters become different.

It is a combination of both case-sensitive and case-insenstive. Completely
bogus. It does not matter what computer I am using. I get the same
bogus behavior on all my operating systems and Postgres versions.

Brian Ceccarelli
Talus Software
4605 Woodmill Run
Apex, NC 27539

-----Original Message-----
From: gsstark(at)gmail(dot)com [mailto:gsstark(at)gmail(dot)com] On Behalf Of Greg Stark
Sent: Thursday, August 20, 2009 5:50 PM
To: Brian Ceccarelli
Subject: Re: BUG #4999: select 'a' < 'A' is true, but should be false . . .

On Thu, Aug 20, 2009 at 9:24 PM, Brian
Ceccarelli<ceccareb(at)talussoftware(dot)com> wrote:

> The operator equals is case sensitive.
> The operator < and > are case-insensitive.
> This is not consistent.
>
> Most of the time, operator < and > are case-insenstive, until you compare
> the upper and lower cases of the same letter.

This is all down to your collation locale. en_US and most other non-C
locales sort case insensitively. You'll find the Unix command "sort"
and other utilities behave similarly. You can set your lc_collate to C
if you want strict ascii binary byte order. In 8.3 and previous this
was a cluster-wide setting but in 8.4 you can set it per-database.

If you can find any cases that are actually inconsistent (the above
examples are not -- you need to find three values where a < b < c but
a >= c) then this is a serious problem and you should not use that
locale with postgres or it will result in corrupt indexes.
--
greg
http://mit.edu/~gsstark/resume.pdf

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2009-08-26 02:18:16 Re: BUG #4999: select 'a' < 'A' is true, but should be false . . .
Previous Message WANGRUNGVICHAISRI, SHIVESH 2009-08-26 01:25:50 Re: BUG #4996: postgres.exe memory consumption keeps going up